(54) Title: INSECT INHIBITORY BACILLUS THURINGIENSIS PROTEINS, FUSIONS, AND METHODS OF USE THEREFOR

(57) Abstract: Novel insect inhibitory proteins are disclosed comprising two different components, both of which are required for 
biological activity. Various methods of linking both components together, so that a single protein provides insect inhibitory activity, 
are disclosed. Also disclosed are novel Bacillus thuringiensis nucleic acid sequences encoding Coleopteran-inhibitory crystal
proteins, designated tIC100 (29-kDa) and tIC101 (14-kDa). Also disclosed are methods of making and using nucleic acid sequences in
the development of the transgenic plant cells containing the novel nucleic acid sequences disclosed herein. 
INSECT INHIBITORY BACILLUS THURINGIENSIS PROTEINS, FUSIONS, AND METHODS OF USE THEREFOR

CROSS REFERENCE TO RELATED APPLICATION 

This application claims the benefit of priority to US Provisional Application No. 
60/232,099, filed September 12, 2000. 

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates generally to the field of molecular biology. More 
particularly, the present invention concerns a new class of insect inhibitory proteins comprising 
two different components, both of which are required for biological activity. The present 
invention concerns the construction of coleopteran-inhibitory crystal proteins, in particular
CryET33/CryET34 and tIC100/tIC101 from Bacillus thuringiensis. Various methods of linking
the proteins together, so that a single protein provides insect inhibitory activity, are disclosed. 
The use of nucleic acid sequences as diagnostic probes and templates for protein synthesis, and 
the use of polypeptides, fusion proteins, antibodies, and peptide fragments in various insect 
inhibitory, immunological, and diagnostic applications are also disclosed, as are methods of 
making and using nucleic acid sequences in the development of transgenic plant cells containing 
the nucleic acid sequences disclosed herein. 

Description of the Related Art 

Environmentally-sensitive methods for controlling or eradicating insect infestation are 
desirable in many instances, in particular when crops of commercial interest are at issue. The 
most widely used environmentally-sensitive insect inhibitory formulations developed in recent 
years have been composed of microbial pest control agents derived from the bacterium Bacillus 
thuringiensis. B. thuringiensis is well known in the art, and is characterized morphologically as 
a Gram-positive bacterium that produces crystal proteins or inclusion bodies which are 
aggregations of proteins specifically active against certain orders and species of insects. Many 
different strains of B. thuringiensis have been shown to produce insect inhibitory crystal 
proteins. Compositions including B. thuringiensis strains which produce insect inhibitory 
proteins have been commercially available and used as environmentally-acceptable pest control
agents because they are quite toxic to the specific target insect, but are harmless to plants and 
other non-targeted organisms. 

There are several B.t. crystal protein categories established based on primary structure 
information and the degree of protein similarities to one another. Over the past decade, research 
on the structure and function of B. thuringiensis crystal proteins has covered all of the major 
categories, and while these proteins differ in specific structure and function, general similarities 
in the structure and function are assumed. Based on the accumulated knowledge of B. 
thuringiensis insect inhibitory proteins, a generalized mode of action for B. thuringiensis insect 
inhibitory proteins has been created and includes: ingestion by the insect, solubilization in the 
insect midgut (a combination of stomach and small intestine), resistance to digestive enzymes 
sometimes with partial digestion actually "activating" the insect inhibitory protein, binding to the 
midgut cells, formation of a pore in the insect cells and the disruption of cellular homeostasis 
(English and Slatin, 1992). 

Many of the δ-endotoxins are related to various degrees by similarities in their amino
acid sequences. Historically, the proteins and the genes which encode them were classified
based largely upon their spectrum of insect inhibitory activity. The review by Schnepf et al.
(Microbiol. Mol. Biol. Rev. (1998) 62:775-806) discusses the genes and proteins that were 
identified in B. thuringiensis prior to 1998, and sets forth the most recent nomenclature and 
classification scheme as applied to B. thuringiensis insect inhibitory genes and proteins. Using 
older nomenclature classification schemes, cry1 genes were deemed to encode lepidopteran-inhibitory
Cry1 proteins, cry2 genes were deemed to encode lepidopteran- and dipteran-inhibitory
Cry2 proteins, cry3 genes were deemed to encode coleopteran-inhibitory Cry3
proteins, and cry4 genes were deemed to encode dipteran-inhibitory Cry4 proteins. However,
new nomenclature systematically classifies the Cry proteins based upon amino acid sequence 
homology rather than upon insect target specificities. The classification scheme for many known 
proteins, not including allelic variations in individual proteins, including dendrograms and full
Bacillus thuringiensis protein lists is summarized and regularly updated at 
http://epunix.biols.susx.ac.iok/Home/Neil_Cricto 
Most of the nearly 200 B.t. crystal proteins presently known have some degree of
lepidopteran activity associated with them. The large majority of Bacillus thuringiensis insect 
inhibitory proteins which have been identified do not have coleopteran controlling activity. 
Therefore, it is particularly important, at least for commercial purposes, to identify additional 
coleopteran specific insect inhibitory proteins. 

The B.t. proteins which have been identified as having coleopteran-inhibitory activity are
either related to the Cry3 protein class, or are greater than about 74 kDa in size (Bernhard, 1986;
Donovan et al., 1988, 1992; Herrnstadt et al., 1986; Hofte et al., 1987, 1989; Krieg et al., 1983,
1984, 1987; McPherson et al., 1988; Sekar et al., 1987; Sick et al., 1990; U.S. Pat. No. 
4,766,203; U.S. Pat. No. 4,771,131; U.S. Pat. No. 4,797,279; U.S. Pat. No. 4,910,016; U.S. Pat. 
No. 4,966,155; U.S. Pat. No. 4,966,765; U.S. Pat. No. 4,999,192; U.S. Pat. No. 5,006,336; U.S. 
Pat. No. 5,024,837; U.S. Pat. No. 5,055,293; U.S. Patent No. 6,023,013; European Pat. Appl.
Publ. No. 0318143; Eur. Pat. Appl. Publ. No. 0324254; Eur. Pat. Appl. Publ. No. 0382990; PCT 
Intl. Pat. Appl. Publ. No. WO 90/13651; Intl. Pat. Appl. Publ. No. WO 91/07481). 

U.S. Pat. No. 6,063,756 disclosed Bacillus thuringiensis strains comprising novel crystal 
proteins which exhibit insect inhibitory activity against coleopteran insects including red flour 
beetle larvae (Tribolium castaneum) and Japanese beetle larvae (Popillia japonica). Also
disclosed therein are novel B. thuringiensis genes, designated cryET33 and cryET34, which
encode the coleopteran-inhibitory crystal proteins ET33 and ET34. cryET33 encodes the
CryET33 (29-kDa) crystal protein, and the cryET34 gene encodes the 14-kDa CryET34 crystal 
protein. Also disclosed therein are methods of making and using transgenic cells comprising the 
novel nucleic acid sequences of the invention. 

Rupar et al. (WO00/066742; PCT/US00/12136) describe still other expression systems 
isolated from Bacillus thuringiensis strains which express proteins, which, when present in 
approximately equimolar concentrations, exhibit Coleopteran insecticidal activity. In particular, 
a binary toxin system referred to as CryET80 and CryET76, ET76 being about 44 kDa and ET80
being about 14 kDa, is effective in controlling corn rootworms.

Narva et al. (U.S. Patent Application Serial No. 09/378,088; WO01/14417(A2); 
PCT/US00/22942) disclose yet at least one other coleopteran inhibitory binary toxin exhibiting 
corn rootworm controlling bioactivity, isolated from Bacillus thuringiensis, and describe the 
construction of a fusion between the two components of the toxin, but failed to demonstrate any
bioactivity of this fusion. 

It would be useful to provide to plants a protein which exhibits coleopteran-inhibitory
activity, which is less than about 74 kDa in size, and which is expressed from a single open reading
frame in order to ensure, at least in plants, simultaneous expression of both components and, by
conserving genetic elements, to provide an easier means for plant breeding.

SUMMARY OF THE INVENTION 

The present invention discloses novel coleopteran-inhibitory proteins and fusions of
these proteins which also surprisingly exhibit insecticidal activity equivalent to the levels of 
activity exhibited by the native proteins, as well as novel nucleic acid sequences which encode 
these proteins. Some of the improvements in the art claimed and disclosed herein include the 
expression of a nucleic acid sequence encoding two-component toxins in planta driven by one 
promoter, wherein said sequence encodes a fusion of the two components which allows for 
conservation of genetic elements and ensures expression of the whole toxin within one cell at the 
same time. Also disclosed are methods of making and using said nucleic acid sequence in the 
development of transgenic plant cells containing the nucleic acid sequences disclosed herein. 

One aspect of the present invention includes the amino acid and nucleic acid sequences as 
set forth in SEQ ID NO:2 and SEQ ID NO:4, respectively corresponding to Bacillus thuringiensis
insecticidal crystal proteins tIC100 and tIC101. These proteins can be isolated and purified after
expression from such nucleic acids as those set forth in SEQ ID NO:1 and SEQ ID NO:3.

Another aspect of the present invention includes novel amino acid and nucleic acid 
sequences resulting from the fusion of the CryET33 coding sequence in frame with the CryET34 
coding sequence (SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17), and novel
amino acid and nucleic acid sequences resulting from the fusion of the CryET34 coding
sequence in frame with the CryET33 coding sequence (SEQ ID NO:19, SEQ ID NO:21). The
present invention also includes novel amino acid and nucleic acid sequences resulting from the
fusion of the tIC100 coding sequence in frame with the tIC101 coding sequence (SEQ ID NO:7,
SEQ ID NO:9), and amino acid and nucleic acid sequences resulting from the fusion of the
tIC101 coding sequence in frame with the tIC100 coding sequence (SEQ ID NO:5). Given the
similarity in size, sequence, and insect inhibitory spectrum activity between the CryET33 and
tIC100 proteins, as well as between the CryET34 and tIC101 proteins, fusions comprising the
CryET33 sequence in frame with the tIC101 sequence and the tIC100 sequence in frame with the
CryET34 sequence are also envisioned. tIC100 and tIC101 are each believed to be novel
proteins which have been shown to exhibit Coleopteran insecticidal activity when present 
together in a composition in about equimolar ratios. 
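
By way of illustration only, the in-frame joining of two coding sequences described above can be sketched in Python as follows. The short sequences, the linker, and the function names are hypothetical placeholders and do not correspond to any SEQ ID NO disclosed herein. The sketch assumes that each input is a complete open reading frame whose length is a multiple of three, and that the stop codon of the upstream sequence is removed so that translation continues into the downstream sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(first_cds: str, second_cds: str, linker: str = "") -> str:
    """Join two coding sequences (and an optional linker) into one reading frame."""
    first_cds, second_cds, linker = (
        s.upper() for s in (first_cds, second_cds, linker)
    )
    for name, seq in (
        ("first_cds", first_cds),
        ("second_cds", second_cds),
        ("linker", linker),
    ):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of three")
    # Remove the upstream stop codon so the ribosome reads through
    # into the linker and the downstream component.
    if first_cds[-3:] in STOP_CODONS:
        first_cds = first_cds[:-3]
    return first_cds + linker + second_cds


def has_single_orf(cds: str) -> bool:
    """True if cds starts with ATG, ends with a stop, and has no internal stop."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return (
        len(cds) % 3 == 0
        and codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in codons[:-1])
    )


# Hypothetical toy sequences standing in for the two toxin components.
toy_component_a = "ATGGCTAAATAA"   # ATG GCT AAA TAA
toy_component_b = "ATGCCGGATTGA"   # ATG CCG GAT TGA
toy_linker = "GGTGGC"              # GGT GGC (Gly-Gly), an assumed linker

fused = fuse_in_frame(toy_component_a, toy_component_b, toy_linker)
```

Concatenating the two toy sequences without removing the upstream stop codon leaves an internal stop, so only the upstream component would be translated; the fused construct reads through both components from a single start codon.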

Another aspect of the present invention relates to a recombinant vector comprising a 
nucleic acid sequence encoding a CryET33/CryET34, CryET34/CryET33, tIC100/tIC101,
tIC101/tIC100, CryET33/tIC101, or tIC100/CryET34 fusion protein, wherein the sequence
encoding the protein is within a single expression cassette and its expression is controlled or 
driven by a single promoter. A recombinant host cell transformed with such a recombinant 
vector, and a biologically pure culture of the recombinant host cell so transformed are also 
exemplified herein. The host cell can be a plant cell or a bacterium, the bacterium preferably 
being a B. thuringiensis bacterium. In addition, a recombinant vector comprising a nucleic acid 
sequence encoding the tIC100 and the tIC101 proteins from within a single operon is also
disclosed. A recombinant host cell transformed with such a recombinant vector and a 
biologically pure culture of the recombinant host cell so transformed are also exemplified herein. 
The host cell can be a plant cell or a bacterium, the bacterium preferably being a Pseudomonas 
or a B. thuringiensis species of bacterium. 

The present invention discloses an isolated insecticidal polypeptide selected from the 
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID 
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ
ID NO:22, SEQ ID NO:24, and SEQ ID NO:26. The isolated insecticidal polypeptide exhibits 
insecticidal activity when provided in an orally acceptable insect diet to a susceptible 
Coleopteran insect or Coleopteran insect larva. The isolated insecticidal polypeptide exhibits 
insecticidal activity when provided in an orally administrable diet to a susceptible Coleopteran 
insect or Coleopteran insect larva. The isolated insecticidal polypeptide exhibits a preferred 
insect inhibitory activity against a Coleopteran insect, and the preferred Coleopteran insect is a 
cotton boll weevil adult or a cotton boll weevil larva. 
The insecticidal polypeptide can be formulated into a composition comprising an
insecticidally effective amount of the polypeptide wherein the composition is a bacterial cell 
which expresses the polypeptide from a polynucleotide sequence that encodes said polypeptide. 
The composition can be any of or a combination of a cell extract, a cell suspension, a cell 
homogenate, a cell lysate, a cell supernatant, a cell filtrate, or a cell pellet. The bacterial cell 
composition is preferably a bacterial cell comprised of a bacterial species selected from the 
species consisting of a Bacillus species, an Escherichia species, a Salmonella species, an 
Agrobacterium species, and a Pseudomonas species of bacterial cell. The more preferable 
bacterial cell composition can be selected from the group of bacterial cells containing a 
recombinant plasmid, the group of bacterial cells being selected from a sIC2000 bacterial cell, a 
sIC2001 bacterial cell, a sIC2002 bacterial cell, a sIC2003 bacterial cell, a sIC2006 bacterial cell, 
a sIC2007 bacterial cell, a sIC2008 bacterial cell, and a sIC2010 bacterial cell.

The insecticidal composition can be an insecticidally effective amount of any of the 
polypeptides disclosed herein and can be formulated as a powder, dust, pellet, granule, spray, 
emulsion, colloid, or solution. The composition can be prepared by desiccation, lyophilization, 
homogenization, extraction, filtration, centrifugation, sedimentation, or concentration. The 
composition should contain the insecticidal polypeptide present in a concentration of from about 
0.001% to about 99% by weight. 

The present invention also discloses an isolated polynucleotide sequence encoding an 
insecticidal polypeptide, wherein said polynucleotide is selected from the group consisting of 
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID
NO:23, and SEQ ID NO:25, and biologically functional equivalents thereof. These 
polynucleotide sequences encode polypeptides which exhibit Coleopteran insecticidal activity 
when provided orally to a susceptible Coleopteran insect or Coleopteran insect larva. These 
polynucleotide sequences encode polypeptides which exhibit Coleopteran insecticidal activity 
when provided in an orally administrable diet or composition to a Coleopteran insect or 
Coleopteran insect larva. These polynucleotide sequences or variants of these sequences which 
encode the polypeptides as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID 
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ
ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and SEQ ID NO:26 or functional equivalents of
these polypeptides are useful for controlling Coleopteran insects, in particular cotton boll weevils 
and cotton boll weevil larvae. A further useful polynucleotide sequence which is disclosed 
herein is a polynucleotide sequence which is or is complementary to one or more of the 
polynucleotide sequences as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, and SEQ ID NO:25 which hybridizes under stringent 
conditions as defined herein to a polynucleotide sequence which is complementary to or which 
encodes a polypeptide selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ 
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and SEQ ID NO:26, and
biologically functional equivalents thereof. 

Nucleic Acid and Amino Acid Sequences 

The present invention concerns nucleic acid sequences that can be isolated from Bacillus 
thuringiensis strains, or synthesized entirely in vitro using methods that are well-known to those 
of skill in the art. As used herein, the term "nucleic acid sequence" refers to a DNA molecule 
that has been isolated free of total genomic DNA of a particular species. Therefore, a nucleic 
acid sequence encoding a crystal protein or a fusion of crystal proteins refers to a DNA molecule 
that contains crystal protein coding sequences yet is isolated away from, or purified free from, 
total genomic DNA of the species from which the nucleic acid sequence is obtained, which in the 
instant case is the genome of the Gram-positive bacterial genus, Bacillus, and in particular, the 
species of Bacillus known as B. thuringiensis. Also included within the term "nucleic acid 
sequence", are recombinant vectors, including, for example, plasmids, cosmids, phagemids, 
phage, viruses, and the like. 

Similarly, a nucleic acid sequence comprising an isolated or purified crystal protein- 
encoding gene or a nucleic acid sequence encoding a fusion of crystal proteins refers to a nucleic 
acid sequence which may include, in addition to peptide encoding sequences, certain other 
elements such as, regulatory sequences, isolated substantially away from other naturally 
occurring genes or protein-encoding sequences. In this respect, the term "gene" is used for 
simplicity to refer to a functional protein-, polypeptide- or peptide-encoding unit. As will be
understood by those in the art, this functional term includes both genomic sequences, operon 
sequences and smaller engineered gene sequences that express, or may be adapted to express, 
proteins, polypeptides or peptides. 

"Isolated substantially away from other coding sequences" means that the gene of 
interest, in this case, a gene encoding a bacterial crystal protein or bacterial crystal protein 
fusion, forms the significant part of the coding region of the nucleic acid sequence, and that the 
nucleic acid sequence does not contain large portions of naturally-occurring coding sequences, 
such as large chromosomal fragments or other functional genes or operon coding regions. Of 
course, this refers to the nucleic acid sequence as originally isolated, and does not exclude genes, 
recombinant genes, synthetic linkers, or coding regions later added to the sequence by the hand 
of man. 

In particular embodiments, the invention comprises isolated nucleic acid sequences 
selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:28, SEQ 
ID NO:29, SEQ ID NO:30, and SEQ ID NO:31. The invention also is directed to recombinant 
vectors incorporating nucleic acid sequences that encode a protein or fusion protein that includes 
within its amino acid sequence an amino acid sequence comprising SEQ ID NO:2, SEQ ID 
NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID 
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and SEQ ID NO:26.

The term "a sequence essentially as set forth in SEQ ID NO:2", for example, means that 
the sequence substantially corresponds to a portion of the sequence of SEQ ID NO:2 and has 
relatively few amino acids that are not identical to, or are not biologically functional equivalents 
of, the amino acids of any of the sequences contemplated herein. The term "biologically 
functional equivalent" is well understood in the art and is further defined in detail herein. 
Accordingly, amino acid sequences that have between about 70% and about 80%, or more 
preferably between about 81% and about 90%, or even more preferably between about 91% and 
about 99% amino acid sequence identity to each other likely are functional equivalents of each 
other if each amino acid sequence exhibits some measurable activity such as insecticidal activity 
and each amino acid sequence provides comparable measurable activity when present in
equimolar or substantially identical equimolar amounts. Functional equivalence to the amino 
acid sequences of SEQ ID NO:2 when combined in equimolar ratios with SEQ ID NO:4, for 
example, will be amino acid sequences which are from about 70% to about 80% identical to, or 
more preferably from about 81% to about 90% identical to, or even more preferably from about 
91% to about 99% identical to SEQ ID NO:2 and SEQ ID NO:4 and also exhibit substantially 
the same level of insecticidal activity on a weight to weight basis or a mole to mole basis. 
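
The identity ranges recited above can be illustrated with a minimal Python sketch. It assumes that two amino acid sequences have already been aligned to equal length, with "-" marking gaps, and that identity is computed over the aligned columns; the function names, the example peptides, and the treatment of values falling between the recited ranges are assumptions made for illustration. Sequence identity alone does not establish functional equivalence, which as stated above also requires comparable measurable activity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the ranges recited in the text."""
    if pct >= 100.0:
        return "identical"
    if pct >= 91.0:
        return "about 91% to about 99%"
    if pct >= 81.0:
        return "about 81% to about 90%"
    if pct >= 70.0:
        return "about 70% to about 80%"
    return "below about 70%"


# Hypothetical ten-residue peptides differing at two positions.
example_a = "MKTLLVAGAT"
example_b = "MKTLIVAGST"
example_pct = percent_identity(example_a, example_b)
example_tier = identity_tier(example_pct)
```

In this toy case eight of ten aligned positions match, which places the pair in the lowest recited range.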

Nucleic acid sequences can also be functionally equivalent to each other. In this case, a 
first nucleic acid sequence encoding a first peptide can be functionally equivalent to a second 
nucleic acid sequence encoding the same first peptide, primarily because of the redundancy of 
the genetic code. The second nucleic acid sequence can also be functionally equivalent to the 
first nucleic acid sequence if the peptide encoded by the second nucleic acid sequence is 
substantially similar to the first peptide, for example exhibiting from about 70% to about 80% 
identity to, or more preferably from about 81% to about 90% identity to, or even more preferably 
from about 91% to about 99% identity to the first peptide encoded by the first nucleic acid 
sequence, in particular, if the first and the second peptides exhibit substantially the same level of 
measurable activity on a weight to weight basis or on a mole to mole basis. 
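
The redundancy of the genetic code referred to above can be illustrated with a short Python sketch in which two different nucleic acid sequences are translated using the standard genetic code and yield the same peptide. The example sequences and function name are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Two hypothetical sequences differing at three third-codon positions.
first_sequence = "ATGGCTAAAGGT"    # ATG GCT AAA GGT
second_sequence = "ATGGCCAAGGGC"   # ATG GCC AAG GGC

first_peptide = translate(first_sequence)
second_peptide = translate(second_sequence)
```

Although the two nucleic acid sequences differ at three of twelve positions, both encode the same four-residue peptide, so they are functionally equivalent in the sense described above.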

The nucleic acid sequences of the present invention encompass sequences encoding 
biologically-functional, equivalent peptides. Such sequences may arise as a consequence of 
codon degeneracy and functional equivalency that are known to occur naturally within nucleic 
acid sequences and the proteins thus encoded. Alternatively, functionally-equivalent proteins or 
peptides may be created via the application of recombinant DNA technology, in which changes 
in the protein structure may be engineered, based on considerations of the properties of the 
amino acids being exchanged. Changes designed by man may be introduced through the 
application of site-directed mutagenesis techniques, e.g., to introduce improvements to the 
antigenicity of the protein or to test mutants in order to examine activity at the molecular level. 

It will also be understood that amino acid and nucleic acid sequences may include 
additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences, and yet 
still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence 
meets the criteria set forth above, including the maintenance of biological protein activity where 
protein expression is concerned. The addition of terminal sequences particularly applies to
nucleic acid sequences that may, for example, include various non-coding sequences flanking
either of the 5' or 3' portions of the coding region or may include various internal sequences, i.e.,
introns, which are known to occur within genes. 

The nucleic acid sequences of the present invention, regardless of the length of the 
coding sequence itself, may be combined with other nucleic acid sequences, such as promoters, 
polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding 
sequences, and the like, such that their overall length may vary considerably. It is therefore 
contemplated that a nucleic acid sequence of almost any length may be employed, with the total 
length preferably being limited by the ease of preparation and use in the intended recombinant 
DNA protocol. For example, nucleic acid fragments may be prepared that include a short 
contiguous stretch encoding either of the peptide sequences disclosed in SEQ ID NO:2 or SEQ 
ID NO:4, or that are identical to or complementary to nucleic acid sequences which encode any 
of the peptides disclosed in SEQ ID NO:2 or SEQ ID NO:4, and particularly those nucleic acid 
sequences disclosed in SEQ ID NO: 1 or SEQ ID NO:3. For example, nucleic acid sequences 
consisting of from about 14 nucleotides, and up to about 10,000, or to about 5,000, or to about 
3,000, or to about 2,000, or to about 1,000, or to about 500, or to about 200, or to about 100, or 
to about 50, and to about 14 base pairs in length (including all intermediate lengths) are also 
contemplated to be useful. 

It will be readily understood that "intermediate lengths", in these contexts, means any 
length between the quoted ranges, such as 18, 19, 20, 21, 22, 23, etc.; 30, 31, 32, etc.; 50, 51, 52, 
53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.; including all integers through the 200- 
500; 500-1,000; 1,000-2,000; 2,000-3,000; 3,000-5,000; and up to and including sequences of 
about 5200 nucleotides and the like. 

It will also be understood that this invention is not limited to the particular nucleic acid 
sequences which encode peptides of the present invention, or which encode the amino acid 
sequences of, for example, SEQ ID NO:2 or SEQ ID NO:4, including those nucleic acid 
sequences which are particularly disclosed in SEQ ID NO:1 or SEQ ID NO:3. Recombinant
vectors and isolated nucleic acid sequences may, therefore, variously include the peptide-coding 
regions themselves, coding regions bearing selected alterations or modifications in the basic 
coding region, or they may encode larger polypeptides that nevertheless include these peptide-
coding regions or may encode biologically functional equivalent proteins or peptides that have
variant amino acid sequences.

If desired, one may also prepare fusion proteins and peptides other than those disclosed 
and claimed herein, e.g., where the peptide-coding regions are aligned within the same 
expression unit with other proteins or peptides having desired functions, such as for purification 
or immunodetection purposes (e.g., proteins that may be purified by affinity chromatography and 
enzyme label coding regions, respectively). 

Recombinant vectors form further aspects of the present invention. Particularly useful 
vectors are contemplated to be those vectors in which the coding portion of the nucleic acid 
sequence, whether encoding a full length protein or smaller peptide, is positioned under the 
control of a promoter. The promoter may be in the form of the promoter that is naturally 
associated with a gene encoding peptides of the present invention, as may be obtained by 
isolating the 5' non-coding sequences located upstream of the coding sequence, for example, 
using recombinant cloning and/or thermal amplification technology, in connection with the 
compositions disclosed herein. 

Nucleic Acid Sequences as Hybridization Probes and Primers 

In addition to their use in directing the expression of crystal fusion proteins or peptides of 
the present invention, the nucleic acid sequences contemplated herein also have a variety of other 
uses. For example, they also have utility as probes or primers in nucleic acid hybridization 
embodiments. As such, it is contemplated that nucleic acid sequences that comprise a sequence 
region that consists of at least a 14 nucleotide long contiguous sequence that has the same 
sequence as, or is complementary to, a 14 nucleotide long contiguous nucleic acid sequence of, 
for example, SEQ ID NO:1 or SEQ ID NO:3 will find particular utility. Longer contiguous
identical or complementary sequences, e.g., those of about 20, 30, 40, 50, 100, 200, 500, 1000, 
2000, 5000 base pairs, etc. (including all intermediate lengths and up to and including the full- 
length sequence of 5200 base pairs) will also be of use in certain embodiments. 
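
The criterion of a contiguous stretch of at least 14 nucleotides that is identical or complementary to a target sequence can be sketched in Python as follows. The target shown is a hypothetical placeholder rather than SEQ ID NO:1 or SEQ ID NO:3, the function names are assumptions, and "complementary" is taken to mean the reverse complement, as is conventional for antiparallel duplexes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_contiguous_stretch(probe: str, target: str, min_len: int = 14) -> bool:
    """True if probe has a min_len stretch identical or complementary to target."""
    probe, target = probe.upper(), target.upper()
    # Index every window of min_len nucleotides found in the target.
    target_windows = {
        target[i:i + min_len] for i in range(len(target) - min_len + 1)
    }
    # A probe complementary to the target matches once reverse-complemented.
    for candidate in (probe, reverse_complement(probe)):
        for i in range(len(candidate) - min_len + 1):
            if candidate[i:i + min_len] in target_windows:
                return True
    return False


# Hypothetical 27-nucleotide target used only for illustration.
toy_target = "ATGGCTAAAGGTGGCATGCCGGATTGA"
sense_probe = toy_target[5:19]                     # 14 nt, identical strand
antisense_probe = reverse_complement(sense_probe)  # 14 nt, complementary strand
short_probe = toy_target[5:18]                     # 13 nt, below the threshold
```

Both the sense and antisense 14-mers satisfy the criterion, whereas the 13-mer does not, because no window of the required length can be formed from it.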

The ability of such nucleic acid probes to specifically hybridize to crystal protein- 
encoding sequences will enable them to be of use in detecting the presence of complementary 
sequences in a given sample. However, other uses are envisioned, including the use of the
sequence information for the preparation of mutant species primers, or primers for use in 
preparing other genetic constructions. 

Nucleic acid molecules having sequence regions consisting of contiguous nucleotide 
stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so, identical or 
complementary to nucleic acid sequences of, for example, SEQ ID NO:1 or SEQ ID NO:3, are
particularly contemplated as hybridization probes for use in, e.g., Southern and Northern 
blotting. Smaller fragments will generally find use in hybridization embodiments, wherein the 
length of the contiguous complementary region may be varied, such as between about 10-14 and 
about 100 or 200 nucleotides, but larger contiguous complementarity stretches may be used, 
according to the length of the complementary sequences one skilled in the art wishes to detect.

Of course, fragments of nucleic acids may also be obtained by other techniques such as, 
e.g., by mechanical shearing or by restriction enzyme digestion. Small nucleic acid sequences or 
fragments may be readily prepared by, for example, directly synthesizing the fragment by 
chemical means, as is commonly practiced using an automated oligonucleotide synthesizer. 
Also, fragments may be obtained by application of nucleic acid reproduction technology, such as 
the thermal amplification technology of U.S. Pat. Nos. 4,683,195 and 4,683,202, by introducing 
selected sequences into recombinant vectors for recombinant production, and by other 
recombinant DNA techniques generally known to those of skill in the art of molecular biology. 

Accordingly, the nucleotide sequences of the invention may be used for their ability to 
selectively form duplex molecules with complementary stretches of DNA fragments. Depending 
on the application envisioned, one will desire to employ varying conditions of hybridization to 
achieve varying degrees of selectivity of probe towards target sequence. For applications 
requiring high selectivity, one will typically desire to employ relatively stringent conditions to 
form the hybrids, e.g., one will select relatively low salt and/or high temperature conditions, such 
as provided by about 0.02 M to about 0.15 M NaCl at temperatures of about 50°C to about 70°C. 
Such selective conditions tolerate little, if any, mismatch between the probe and the template or 
target strand, and would be particularly suitable for isolating crystal protein-encoding DNA 
sequences. Detection of DNA sequences via hybridization is well-known to those of skill in the 
art, and the teachings of U.S. Pat. Nos. 4,965,188 and 5,176,995 are exemplary of the methods of 
hybridization analyses. Teachings such as those found in the texts of Maloy et al., 1993; Segal, 
1976; Prokop, 1991; and Kuby, 1991, are particularly relevant. 

Of course, for some applications, for example, where one desires to prepare mutants 
employing a mutant primer strand hybridized to an underlying template or where one seeks to 
isolate crystal protein-encoding sequences from related species, functional equivalents, or the 
like, less stringent hybridization conditions will typically be needed in order to allow formation 
of the heteroduplex. In these circumstances, one may desire to employ conditions such as about 
0.15 M to about 0.9 M salt, at temperatures ranging from about 20°C to about 55°C. Cross- 
hybridizing species can thereby be readily identified as positively hybridizing signals with 
respect to control hybridizations. In any case, it is generally appreciated that conditions can be 
rendered more stringent by the addition of increasing amounts of formamide, which serves to 
destabilize the hybrid duplex in the same manner as increased temperature. Thus, hybridization 
conditions can be readily manipulated, and the conditions chosen will generally depend on the 
desired results. 
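
The combined effect of salt, temperature, and formamide on stringency can be illustrated with a rough estimate of hybrid melting temperature. The sketch below uses a widely used empirical approximation for long DNA:DNA hybrids; the formula is not taken from this disclosure, it is unreliable for short oligonucleotides, and the probe length and G+C content shown are hypothetical.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C residues in a nucleic acid sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(gc_pct: float, length: int, na_molar: float,
                formamide_pct: float = 0.0) -> float:
    """Approximate melting temperature (deg C) of a long DNA:DNA hybrid.

    Empirical approximation (an assumption of this sketch):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.61*(%formamide) - 500/length
    """
    return (81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_pct
            - 0.61 * formamide_pct - 500.0 / length)


# Hypothetical 200-nucleotide probe with 50% G+C.
tm_low_salt = estimate_tm(50.0, 200, 0.02)    # salt at the stringent end
tm_high_salt = estimate_tm(50.0, 200, 0.9)    # salt at the permissive end
tm_with_formamide = estimate_tm(50.0, 200, 0.9, formamide_pct=50.0)
```

Under this approximation, lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a given hybridization or wash temperature lies closer to it and tolerates less mismatch.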

In certain embodiments, it will be advantageous to employ nucleic acid sequences of the 
present invention in combination with an appropriate means, such as a label, for determining 
hybridization. A wide variety of appropriate indicator means are known in the art, including 
fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of 
giving a detectable signal. In preferred embodiments, one will likely desire to employ a 
fluorescent label such as fluorescein or related molecules, or an enzyme tag such as urease, 
jellyfish green fluorescent protein or variants thereof, alkaline phosphatase, or peroxidase, 
instead of radioactive or other environmentally undesirable reagents. In the case of enzyme tags, 
colorimetric indicator substrates are known that can be employed to provide a means visible to 
the human eye or spectrophotometrically, to identify specific hybridization with complementary 
nucleic acid-containing samples. 

In general, it is envisioned that the hybridization probes described herein will be useful 
both as reagents in solution hybridization as well as in embodiments employing a solid phase. In 
embodiments involving a solid phase, the test DNA (or RNA) is adsorbed or otherwise affixed to 
a selected matrix or surface. This fixed, single-stranded nucleic acid is then subjected to specific 
hybridization with selected probes under desired conditions. The selected conditions will depend 
on the particular circumstances based on the particular criteria required (depending, for example, 
on the G+C content, type of target nucleic acid, source of nucleic acid, size of hybridization 
probe, etc.). Following washing of the hybridized surface so as to remove nonspecifically bound 
probe molecules, specific hybridization is detected, or even quantitated, by means of the 
incorporated label. 

Recombinant Vectors and Crystal Protein Expression 

In other embodiments, it is contemplated that certain advantages will be gained by 
positioning the coding DNA sequence under the control of a recombinant, or heterologous, 
promoter. As used herein, a recombinant or heterologous promoter is intended to refer to a 
promoter that is not normally associated with a DNA sequence encoding a crystal protein or 
peptide in its natural environment. Such promoters may include promoters normally associated 
with other genes, and/or promoters isolated from any bacterial, viral, eukaryotic, or plant cell. 
Naturally, it will be important to employ a promoter that effectively directs the expression of the 
DNA sequence in the cell type, organism, or even animal, chosen for expression. Those of skill 
in the art of molecular biology generally know the use of promoter and cell type combinations 
for protein expression, for example, see Sambrook et al., 1989. The promoters employed may be 
constitutive, or inducible, and can be used under the appropriate conditions to direct high level 
expression of the introduced DNA sequence, such as is advantageous in the large-scale 
production of recombinant proteins or peptides. Appropriate promoter systems contemplated for 
use in high-level expression include, but are not limited to, the Pichia expression vector system 
(Pharmacia LKB Biotechnology). 

In connection with expression embodiments to prepare recombinant proteins and 
peptides, it is contemplated that longer DNA sequences will most often be used, with DNA 
sequences encoding the entire peptide sequence being most preferred. However, it will be 
appreciated that the use of shorter DNA sequences to direct the expression of crystal peptides or 
epitopic core regions, such as may be used to generate anti-crystal protein antibodies, also falls 
within the scope of the invention. DNA sequences that encode peptide antigens from about 8 to 
about 50 amino acids in length, or more preferably, from about 8 to about 30 amino acids in 
length, or even more preferably, from about 8 to about 20 amino acids in length are contemplated 
to be particularly useful. Such peptide epitopes may be amino acid sequences which comprise 
contiguous amino acid sequences from, for example, SEQ ID NO:2 or SEQ ID NO:4. 
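
For illustration, candidate peptide antigens of the contemplated lengths can be enumerated as contiguous windows of a protein sequence. The sketch below is a minimal example under stated assumptions: the 12-residue sequence is hypothetical and is not SEQ ID NO:2 or SEQ ID NO:4, and no prediction of antigenicity is attempted.

```python
from typing import Iterator, Tuple


def peptide_windows(protein: str, min_len: int = 8,
                    max_len: int = 20) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, peptide) for every contiguous window of allowed length."""
    protein = protein.upper()
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield start + 1, protein[start:start + length]


# Hypothetical 12-residue sequence; not a sequence from this disclosure.
example_protein = "MKTAYIAKQRQI"
windows = list(peptide_windows(example_protein, min_len=8, max_len=10))
```

A DNA sequence encoding any such window is three nucleotides per residue, so an 8 to 20 residue peptide corresponds to a coding stretch of 24 to 60 nucleotides.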

Crystal Protein Transgenes and Transgenic Plants 

In yet another aspect, the present invention provides methods for producing a transgenic 
plant which expresses a nucleic acid sequence encoding one of the novel crystal proteins of the 
present invention. The process of producing transgenic plants is well-known in the art. For 
example, the method comprises, in general, transforming a suitable host cell with a DNA 
sequence which contains a promoter operatively linked to a coding region that encodes, for 
example, a B. thuringiensis CryET33/CryET34 crystal fusion protein, or for example, a B. 
thuringiensis CrytIC100 or CrytIC101 crystal protein, or combinations thereof. Such a coding 
region is generally operatively linked to a transcription-terminating region, whereby the 
promoter is capable of driving the transcription of the coding region in the cell, and hence 
providing the cell the ability to produce the recombinant protein in vivo. Alternatively, in 
instances where there is a desire to control, regulate, or decrease the amount of a particular 
recombinant crystal protein expressed in a particular transgenic cell, the invention also provides 
for the expression of crystal protein antisense mRNA. The use of antisense mRNA as a means of 
controlling or decreasing the amount of a given protein of interest in a cell is well-known in the 
art. 

Further embodiments disclosed herein include expression of the proteins tIC100 and 
tIC101 (SEQ ID NO:2 and SEQ ID NO:4, respectively) in a plant, alone or in combination. For 
example, tIC100 could be expressed in one plant from an expression cassette which is linked 
physically to a second cassette expressing tIC101 so that both proteins are expressed in the same 
plant. Each protein could be expressed in a plant from a separate promoter, with the coding 
sequences of the two proteins being physically linked, for example, on the same chromosome. 
Alternatively, each protein could be expressed in a plant from a separate promoter without the 
coding sequences being physically linked; for example, the expression cassettes, each 
containing a promoter operably linked to a coding sequence, could be present in the same 
plant cell but on different chromosomes, so that Mendelian segregation can be achieved if 
desired. Alternatively, these proteins could be expressed from gene sequences transformed into 
the chloroplast genome, or from autonomously replicating epigenetic elements present within the 
chloroplast stroma. Yet another alternative embodiment comprises expression of these proteins 
as a fusion protein, the carboxy terminus of one of these proteins being linked to the other 
protein either directly, by a flexible amino acid sequence linker, or by an amino acid sequence 
linker comprising a sequence susceptible to protease or autocatalytic cleavage upon expression 
or subcellular localization of the expression product fusion protein, or allowing cleavage of the 
linker region upon ingestion and localization of the fusion protein to the midgut of a target 
insect larva, resulting in the release of the two proteins into the cellular milieu or into the 
midgut digestive fluids in approximately equimolar proportions and allowing the two proteins to 
be activated as a biologically active insecticidal crystal protein. As still another alternative 
embodiment, tIC100 and tIC101 can be mixed with other related binary toxins in various 
compositions or proportions in order to achieve a broader host range, improved insecticidal 
specificity, or improved insecticidal activity. For example, tIC101 could be presented to a 
coleopteran insect in approximately equimolar concentrations with ET33, resulting in a 
surprisingly effective coleopteran insecticidal toxin. tIC100 could be presented to a coleopteran 
insect in approximately equimolar concentrations with ET34, also resulting in a surprisingly 
effective coleopteran insecticidal toxin. Alternatively, these toxin components could be presented 
to a susceptible coleopteran insect in the form of fusions resulting in a surprisingly effective 
coleopteran insecticidal toxin. In yet another embodiment, these toxins could be presented 
together (tIC100, tIC101, ET33, and ET34, together or in various compositions exhibiting 
insecticidal activity) to a coleopteran insect in a composition which facilitates insect resistance 
management practices. Alternatively, these toxin compositions could be provided with other 
coleopteran toxins such as for example Cry22, Cry3, or ET70 to provide surprisingly effective 
compositions for increasing insect resistance management. Additional resistance management 
practices contemplated herein include compositions of insecticidal proteins disclosed herein 
along with non-Bacillus thuringiensis insecticidal proteins, for example, insecticidal proteins 
isolatable from other species known in the art which have been shown to be insecticidal such as 
Xenorhabdus and Photorhabdus species of bacteria. 
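
Where two components are to be combined in approximately equimolar proportions, the mass of each depends on its molecular weight. The following sketch shows the arithmetic, assuming the approximate sizes stated in the Abstract (about 29 kDa for tIC100 and about 14 kDa for tIC101); the function name and the example quantity are hypothetical.

```python
def equimolar_partner_mass(mass_a: float, mw_a_kda: float, mw_b_kda: float) -> float:
    """Mass of protein B supplying the same number of moles as mass_a of protein A.

    The result is in the same mass unit as mass_a.
    """
    if mass_a < 0 or mw_a_kda <= 0 or mw_b_kda <= 0:
        raise ValueError("mass must be non-negative and molecular weights positive")
    moles_a = mass_a / mw_a_kda   # relative molar amount, consistent units
    return moles_a * mw_b_kda


# Approximate sizes as stated in the Abstract.
TIC100_KDA = 29.0
TIC101_KDA = 14.0

# Hypothetical quantity: mass of tIC101 matching 29 micrograms of tIC100.
tic101_ug = equimolar_partner_mass(29.0, TIC100_KDA, TIC101_KDA)
```

On these assumptions an equimolar mixture contains roughly twice as much of the larger component by mass as of the smaller one.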

Another aspect of the invention comprises transgenic plants that express one or more 
genes or gene sequences encoding one or more of the novel polypeptide compositions disclosed 
herein. As used herein, the term "transgenic plant" is intended to refer to a plant that has 
incorporated DNA sequences, including but not limited to genes which are perhaps not normally 
present, DNA sequences not normally transcribed into RNA or translated into a protein 
("expressed"), or any other genes or DNA sequences which one desires to introduce into the 
non-transformed plant, such as genes which may normally be present in the non-transformed 
plant but which one desires to either genetically engineer or to have altered expression. 

Means for transforming a plant cell and the preparation of a transgenic cell line are well- 
known in the art, and are discussed herein. Vectors, plasmids, cosmids, YACs (yeast artificial 
chromosomes) and DNA sequences for use in transforming such cells will, of course, generally 
comprise either the operons, genes, or gene-derived sequences of the present invention, either 
native, or synthetically-derived, and particularly those encoding the disclosed crystal proteins. 
These DNA constructs can further include structures such as promoters, enhancers, introns, 
terminators, operators, polyadenylation signals, or other gene sequences which have positively- 
or negatively-regulating activity upon the particular genes of interest as desired. The DNA 
sequence or gene may encode either a native or modified crystal protein, which will be expressed 
in the resultant recombinant cells, and/or which will impart an improved phenotype to the 
regenerated plant. 

Such transgenic plants may be desirable for increasing the insect resistance of 
a monocotyledonous or dicotyledonous plant, by incorporating into such a plant, a nucleic acid 
sequence comprising one or more of the sequences discussed herein and encoding crystal protein 
which is toxic to Coleopteran insects. Particularly preferred plants include corn, cotton, potato, 
soybean, canola, tomato, turf grasses, wheat, vegetables, ornamental plants, fruit trees, and the 
like. 

In a related aspect, the present invention also encompasses a seed produced by the 
transformed plant, a progeny from such seed, and a seed produced by the progeny of the original 
transgenic plant, produced in accordance with the above process. Such progeny and seeds will 
have a crystal protein-encoding nucleic acid sequence stably incorporated into their genome, and 
such progeny plants will preferably inherit the traits conferred by the nucleic acid sequence in 
Mendelian fashion. All such transgenic plants having incorporated into their nuclear genome 
nucleic acid sequences comprising one or more of the sequences discussed herein and encoding 
one or more crystal proteins or polypeptides are aspects of this invention. 
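
As a worked illustration of Mendelian inheritance of nuclear insertions, the expected progeny classes can be enumerated for a self-pollinated plant. The sketch assumes, hypothetically, that the parent is hemizygous for a single copy of each expression cassette, that the cassettes are unlinked, and that no selection is applied; none of these conditions is stated in this disclosure.

```python
from fractions import Fraction
from itertools import product


def selfed_progeny_classes(n_loci: int = 2) -> dict:
    """Expected fractions of selfed progeny carrying each combination of cassettes.

    Keys are tuples of booleans, one per locus, True when at least one copy
    of that cassette is inherited.
    """
    # Each gamete independently carries or lacks the cassette at each locus.
    gametes = list(product((True, False), repeat=n_loci))
    counts = {}
    for egg, pollen in product(gametes, repeat=2):
        present = tuple(e or p for e, p in zip(egg, pollen))
        counts[present] = counts.get(present, 0) + 1
    total = len(gametes) ** 2
    return {key: Fraction(value, total) for key, value in counts.items()}


classes = selfed_progeny_classes(2)
both_cassettes = classes[(True, True)]      # Fraction(9, 16)
neither_cassette = classes[(False, False)]  # Fraction(1, 16)
```

Under these assumptions nine sixteenths of the progeny carry both cassettes, whereas physically linked cassettes would be expected to segregate together as a single locus.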

Plants comprising cells comprising chloroplasts transformed to contain nucleic acid 
sequences encoding the proteins of the present invention are also contemplated. Such plants 
would not be expected to pass these traits to their progeny plants or seeds through Mendelian 
fashion, but instead would pass on these traits to progeny through maternal transmission means 
well known in the art. 

Site-Specific Mutagenesis 

Site-specific mutagenesis is a technique useful in the preparation of individual peptides, 
or biologically functional equivalent proteins or peptides, through specific mutagenesis of the 
underlying nucleic acid sequence. The technique further provides a ready ability to prepare and 
test sequence variants, for example, incorporating one or more of the foregoing considerations, 
by introducing one or more nucleotide sequence changes into the original nucleic acid sequence. 
Means for site-specific mutagenesis provide for the production of nucleic acid sequence variants 
through the use of specific synthetic oligonucleotide sequences which hybridize to the target 
nucleic acid sequence intended to be altered. Such synthetic oligonucleotides comprise the 
nucleic acid sequence of the desired mutation or sequence variant at the target site sequence, as 
well as a sufficient number of nucleotides complementary to the sequences flanking the target 
site sequence, said synthetic oligonucleotide acting as a primer sequence of sufficient size and 
sequence complexity to form a stable heteroduplex with the target nucleic acid sequence at the 
intended target site and generally flanking both sides of the intended target site sequence. 
Typically, a primer of about 17 to 25 nucleotides in length is preferred, with about 5 to 10 
residues on both sides of the sequence being altered. The target site intended to be altered to 
form the variant sequence could incorporate either a single nucleotide, or alternatively could be 
two nucleotides or even more than two nucleotides each adjacent to each other or interspersed 
throughout the synthetic mutagenesis oligonucleotide sequence. One skilled in the art would 
readily recognize that a single nucleotide sequence change would require a synthetic 
oligonucleotide which would be considerably shorter in length than would a synthetic 
oligonucleotide sequence which is intended for use in incorporating two or more changes to the 
original nucleotide sequence; the latter would therefore generally, although not always, require 
longer sequences of complementarity to the sequences flanking the intended target site sequence(s). 
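
The construction of a mutagenic oligonucleotide for a single nucleotide change can be sketched as follows. This is a minimal example under stated assumptions: the template is hypothetical, the function name is invented for illustration, and the primer is written in the same orientation as the template strand supplied, so that in practice the strand able to anneal to the chosen template would be synthesized.

```python
def mutagenic_primer(template: str, position: int, new_base: str,
                     flank: int = 10) -> str:
    """Build a primer carrying one base change flanked by matching sequence.

    position is 1-based; flank is the number of unchanged residues kept on
    each side of the altered base.
    """
    template = template.upper()
    new_base = new_base.upper()
    idx = position - 1
    if idx - flank < 0 or idx + flank >= len(template):
        raise ValueError("not enough flanking sequence around the target site")
    if template[idx] == new_base:
        raise ValueError("new base is identical to the original base")
    return template[idx - flank:idx] + new_base + template[idx + 1:idx + 1 + flank]


# Hypothetical 30-nucleotide template; not a sequence from this disclosure.
template = "ATGGCTAAACTGGATCCGTTAGCAGGTACC"
primer = mutagenic_primer(template, position=15, new_base="C", flank=8)
```

With eight flanking residues on each side the primer is 17 nucleotides long, and with ten it is 21, both within the range of about 17 to 25 nucleotides discussed above.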

Crystal Protein Screening and Detection Kits 

The present invention contemplates methods and kits for screening samples suspected of 
containing crystal protein polypeptides or crystal protein-related polypeptides, or cells producing 
such polypeptides. A kit may contain one or more antibodies of the present invention, and may 
also contain reagent(s) for detecting an interaction between a sample and an antibody of the 
present invention. The provided reagent(s) can be radio-, fluorescently- or enzymatically- 
labeled. The kit can contain a known radio-, fluorescent-, hapten-, or enzyme-labeled agent 
capable of binding or interacting with a nucleic acid, protein or antibody of the present invention. 

The reagent(s) of the kit can be provided as a liquid solution, attached to a solid support 
or as a dried powder. Preferably, when the reagent(s) are provided in a liquid solution, the liquid 
solution is an aqueous solution. Preferably, when the reagent(s) provided are attached to a solid 
support, the solid support can be chromatographic media, a test plate having a plurality of wells, or 
a microscope slide. When the reagent(s) provided are a dry powder, the powder can be 
reconstituted by the addition of a suitable solvent, that may be provided. 

In still further embodiments, the present invention concerns immunodetection methods 
and associated kits. It is proposed that the crystal proteins or peptides of the present invention 
may be employed to detect antibodies having reactivity therewith, or, alternatively, antibodies 
prepared in accordance with the present invention, may be employed to detect crystal proteins or 
crystal protein-related epitope-containing peptides. In general, these methods will include first 
obtaining a sample suspected of containing such a protein, peptide or antibody, contacting the 
sample with an antibody or peptide in accordance with the present invention, as the case may be, 
under conditions effective to allow the formation of an immunocomplex, and then detecting the 
presence of the immunocomplex. 

In general, the detection of immunocomplex formation is quite well known in the art and 
may be achieved through the application of numerous approaches. For example, the present 
invention contemplates the application of ELISA, RIA, immunoblot (e.g., dot blot), indirect 
immunofluorescence techniques and the like. Generally, immunocomplex formation will be 
detected through the use of a label, such as a radiolabel or an enzyme tag (such as alkaline 
phosphatase, horseradish peroxidase, or the like). Of course, one may find additional advantages 
through the use of a secondary binding ligand such as a second antibody or a biotin/avidin ligand 
binding arrangement, as is known in the art. 

For assaying purposes, it is proposed that virtually any sample suspected of comprising 
either a crystal protein or peptide or a crystal protein-related peptide or antibody sought to be 
detected, as the case may be, may be employed. It is contemplated that such embodiments may 
have application in the titering of antigen or antibody samples, in the selection of hybridomas, 
and the like. In related embodiments, the present invention contemplates the preparation of kits 
that may be employed to detect the presence of crystal proteins or related peptides and/or 
antibodies in a sample. Samples may include cells, cell supernatants, cell suspensions, cell 
extracts, enzyme fractions, protein extracts, or other cell-free compositions suspected of 
containing crystal proteins or peptides. Generally speaking, kits in accordance with the present 
invention will include a suitable crystal protein, peptide or an antibody directed against such a 
protein or peptide, together with an immunodetection reagent and a means for containing the 
antibody or antigen and reagent. The immunodetection reagent will typically comprise a label 
associated with the antibody or antigen, or associated with a secondary binding ligand. 
Exemplary ligands might include a secondary antibody directed against the first antibody or 
antigen or a biotin or avidin (or streptavidin) ligand having an associated label. Of course, as 
noted above, a number of exemplary labels are known in the art and all such labels may be 
employed in connection with the present invention. 

The container will generally include a vial into which the antibody, antigen or detection 
reagent may be placed, and preferably suitably subsequently distributed into samples intended 
for analysis. The kits of the present invention will also typically include a means for containing 
the antibody, antigen, and reagent containers in close confinement for commercial sale. Such 
containers may include injection or blow-molded plastic containers in which the desired vials 
are retained. 

Biological Functional Equivalents 

Modification and changes may be made in the structure of the peptides of the present 
invention and nucleic acid sequences which encode them and still obtain a functional molecule 
that encodes a protein or peptide with desirable characteristics. The following is a discussion 
based upon changing the amino acids of a protein to create an equivalent, or even an improved, 
second-generation molecule. In particular embodiments of the invention, mutated or variant 
crystal proteins are contemplated to be useful for increasing the insect inhibitory activity of the 
protein, and consequently preferably increasing the insect inhibitory activity and/or expression of 
the recombinant transgene in a plant cell. The amino acid changes may be achieved by changing 
the codons of the DNA sequence, according to the codons given in Table 1. 

TABLE 1. Amino Acids and Corresponding Codons 

Amino Acids      *     **    Codons 
Alanine          Ala   A     GCA  GCC  GCG  GCU 
Cysteine         Cys   C     UGC  UGU 
Aspartate        Asp   D     GAC  GAU 
Glutamate        Glu   E     GAA  GAG 
Phenylalanine    Phe   F     UUC  UUU 
Glycine          Gly   G     GGA  GGC  GGG  GGU 
Histidine        His   H     CAC  CAU 
Isoleucine       Ile   I     AUA  AUC  AUU 
Lysine           Lys   K     AAA  AAG 
Leucine          Leu   L     UUA  UUG  CUA  CUC  CUG  CUU 
Methionine       Met   M     AUG 
Asparagine       Asn   N     AAC  AAU 
Proline          Pro   P     CCA  CCC  CCG  CCU 
Glutamine        Gln   Q     CAA  CAG 
Arginine         Arg   R     AGA  AGG  CGA  CGC  CGG  CGU 
Serine           Ser   S     AGC  AGU  UCA  UCC  UCG  UCU 
Threonine        Thr   T     ACA  ACC  ACG  ACU 
Valine           Val   V     GUA  GUC  GUG  GUU 
Tryptophan       Trp   W     UGG 
Tyrosine         Tyr   Y     UAC  UAU 

* indicates three letter abbreviation for the corresponding amino acid name 
** indicates single letter abbreviation for the corresponding amino acid name 

For example, certain amino acids, known as conservative amino acids, may be substituted 
for other amino acids in a protein structure without appreciable loss of interactive binding 
capacity with structures such as, for example, antigen-binding regions of antibodies or binding 
sites on substrate molecules. Since it is the interactive capacity and nature of a protein that 
defines the protein's biological functional activity, certain amino acid sequence substitutions can 
be made in a protein sequence, and, of course, its underlying DNA coding sequence, and 
nevertheless obtain a protein with like properties. It is thus contemplated by the inventors that 
various changes may be made in the peptide sequences of the disclosed compositions, or 
corresponding DNA sequences which encode said peptides without appreciable loss of their 
biological utility or activity. 
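
The relationship between codons and amino acids set out in Table 1 can be expressed as a lookup, which makes it straightforward to translate a coding sequence or to list the alternative codons for a residue. The sketch below is illustrative only; the function names and the example sequences are hypothetical, and termination codons are deliberately absent because Table 1 lists only sense codons.

```python
# Codon assignments as listed in Table 1 (RNA alphabet, single-letter amino acids).
CODONS_BY_AMINO_ACID = {
    "A": ["GCA", "GCC", "GCG", "GCU"], "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"], "E": ["GAA", "GAG"], "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"], "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"], "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"], "M": ["AUG"],
    "N": ["AAC", "AAU"], "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"], "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"], "Y": ["UAC", "UAU"],
}

CODON_TO_AMINO_ACID = {
    codon: aa for aa, codons in CODONS_BY_AMINO_ACID.items() for codon in codons
}


def translate(coding_seq: str) -> str:
    """Translate an RNA or DNA coding sequence; raises KeyError on a stop codon."""
    rna = coding_seq.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AMINO_ACID[rna[i:i + 3]] for i in range(0, len(rna), 3))


def synonymous_codons(codon: str) -> list:
    """Other codons in Table 1 that encode the same amino acid."""
    codon = codon.upper().replace("T", "U")
    amino_acid = CODON_TO_AMINO_ACID[codon]
    return [c for c in CODONS_BY_AMINO_ACID[amino_acid] if c != codon]


# Hypothetical nine-nucleotide examples encoding the same tripeptide.
peptide_a = translate("AUGGCUAAA")
peptide_b = translate("ATGGCCAAG")
```

Two different nucleotide sequences can thus encode the same peptide, and a change of amino acid is made by replacing a codon with one listed for the desired residue.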

In making such changes, the hydropathic index of amino acids may be considered. The 
importance of the hydropathic amino acid index in conferring interactive biologic function on a 
protein is generally understood in the art (Kyte and Doolittle, 1982, incorporated herein by 
reference). It is accepted that the relative hydropathic character of the amino acid contributes to 
the secondary structure of the resultant protein, which in turn defines the interaction of the 
protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, 
antigens, and the like. 

Each amino acid has been assigned a hydropathic index on the basis of its 
hydrophobicity and charge characteristics (Kyte and Doolittle, 1982). These are: isoleucine 
(+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine 
(+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine 
(-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); 
asparagine (-3.5); lysine (-3.9); and arginine (-4.5). 
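
The hydropathic indices listed above lend themselves to a simple tabulation from which residues of similar index can be identified. The following is a minimal sketch; the function names are hypothetical, and the tolerance is left as a parameter to be chosen by the practitioner rather than fixed by the code.

```python
# Hydropathic indices as listed above (single-letter amino acid codes).
HYDROPATHIC_INDEX = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_difference(a: str, b: str) -> float:
    """Absolute difference between the hydropathic indices of two residues."""
    # Rounding avoids spurious floating-point digits such as 0.30000000000000004.
    return round(abs(HYDROPATHIC_INDEX[a] - HYDROPATHIC_INDEX[b]), 6)


def similar_residues(residue: str, tolerance: float) -> list:
    """Other residues whose hydropathic index lies within the given tolerance."""
    return sorted(
        other for other in HYDROPATHIC_INDEX
        if other != residue and hydropathy_difference(residue, other) <= tolerance
    )


candidates_for_glutamate = similar_residues("E", 0.2)
candidates_for_isoleucine = similar_residues("I", 0.5)
```

Hydropathic similarity is only one consideration; a candidate identified in this way would still be assessed for size, charge, and position within the protein.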

It is known in the art that certain amino acids may be substituted for other amino acids 
having a similar hydropathic index or score and still result in a protein with similar biological 
activity, i.e., still obtain a biological functionally equivalent protein. In making such changes, 
the substitution of amino acids whose hydropathic indices are within +/- 0.2 is preferred, those 
which are within +/- 0.1 are particularly preferred, and those within +/- 0.05 are even more 
particularly preferred. 

It is also understood in the art that the substitution of like amino acids can be made 
effectively on the basis of hydrophilicity. U.S. Pat. No. 4,554,101, incorporated herein by 
reference, discloses that the greatest local average hydrophilicity of a protein, as governed by the 
hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein. 

As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been 
assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 +/- 1); glutamate 
(+3.0 +/- 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); 
proline (-0.5 +/- 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine 
(-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4). 

It is understood that an amino acid can be substituted for another having a similar 
hydrophilicity value and still obtain a biologically equivalent, and in particular, an 
immunologically equivalent protein. In such changes, the substitution of amino acids whose 
hydrophilicity values are within +/- 0.2 is preferred, those which are within +/- 0.1 are 
particularly preferred, and those within +/- 0.05 are even more particularly preferred. 
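
The graded preferences just described can be illustrated by classifying a proposed substitution according to the difference in hydrophilicity value. The sketch below uses the central values listed above, ignoring the stated +/- 1 ranges for aspartate, glutamate, and proline, and applies the thresholds as stated in this section; the function name is hypothetical.

```python
from typing import Optional

# Hydrophilicity values as listed above (central values only).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

# Thresholds as stated in this section, from tightest to loosest.
TIERS = (
    (0.05, "even more particularly preferred"),
    (0.1, "particularly preferred"),
    (0.2, "preferred"),
)


def preference_tier(a: str, b: str) -> Optional[str]:
    """Return the preference tier for substituting residue a with b, or None."""
    # Rounding guards against floating-point noise at the threshold values.
    difference = round(abs(HYDROPHILICITY[a] - HYDROPHILICITY[b]), 6)
    for limit, label in TIERS:
        if difference <= limit:
            return label
    return None


arginine_to_lysine = preference_tier("R", "K")
serine_to_asparagine = preference_tier("S", "N")
serine_to_glycine = preference_tier("S", "G")
```

A result of None indicates only that the pair falls outside the stated thresholds, not that the substitution is necessarily unsuitable.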

As outlined above, amino acid substitutions are generally therefore based on the relative 
similarity of the amino acid side-chain substituents, for example, their hydrophobicity, 
hydrophilicity, charge, size, and the like. Exemplary substitutions which take various of the 
foregoing characteristics into consideration are well known to those of skill in the art and 
include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and 
asparagine; and valine, leucine and isoleucine. 

Crystal Protein Insect Inhibitory Compositions and Methods of Use 

The inventors contemplate that the crystal protein compositions disclosed herein will find 
particular utility as insect inhibitory or insecticidal compositions for topical and/or systemic 
application to field crops, grasses, fruits and vegetables, and ornamental plants. In a preferred 
embodiment, the biological insect inhibitory or insecticidal composition comprises an oil 
flowable suspension of bacterial cells which expresses a novel crystal protein disclosed herein. 
Any bacterial host cell expressing the novel nucleic acid sequences disclosed herein and 
producing a crystal protein is contemplated to be useful, such as B. thuringiensis, B. megaterium, 
B. subtilis, E. coli, or Pseudomonas spp. 

In another embodiment, the biological insect inhibitory composition comprises a water 
dispersible granule. This granule comprises bacterial cells which express one or more of the 
novel crystal proteins disclosed herein. Bacteria such as B. thuringiensis, B. megaterium, B. 
subtilis, E. coli, or Pseudomonas spp. cells transformed with a DNA sequence disclosed herein 
and expressing one or more of the crystal proteins are also contemplated to be useful. 

In a third embodiment, the biological insect inhibitory or insecticidal composition 
comprises a wettable powder, dust, pellet, or colloidal concentrate. This powder comprises 
bacterial cells which express one or more of the novel crystal proteins disclosed herein. Bacteria 
such as B. thuringiensis, B. megaterium, B. subtilis, E. coli, or Pseudomonas spp. cells 
transformed with one or more of the nucleic acid sequences disclosed herein and expressing the 
crystal protein are also contemplated to be useful. Such dry forms of the insect inhibitory 
compositions may be formulated to dissolve immediately upon wetting, or alternatively, dissolve 
in a controlled-release, sustained-release, or other time-dependent manner. 

In a fourth embodiment, the biological insect inhibitory or insecticidal composition 
comprises an aqueous suspension of bacterial cells such as those described above which express 
the crystal protein. Such aqueous suspensions may be provided as a concentrated stock solution 
which is diluted prior to application, or alternatively, as a diluted solution ready-to-apply. 

For methods involving application of bacterial cells, the cellular host containing the crystal 
protein gene(s) may be grown in any convenient nutrient medium, where the DNA construct 
provides a selective advantage, providing for a selective medium so that substantially all or all of 
the cells retain the B. thuringiensis gene. These cells may then be harvested in accordance with 
conventional means. Alternatively, the cells can be treated prior to harvesting. 

When the insect inhibitory or insecticidal compositions comprise intact B. thuringiensis 
cells expressing the protein of interest, such bacteria may be formulated in a variety of ways. 
They may be employed as wettable powders, granules or dusts, by mixing with various inert 
materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates, phosphates, and the 
like) or botanical materials (powdered corncobs, rice hulls, walnut shells, and the like). The 
formulations may include spreader-sticker adjuvants, stabilizing agents, other pesticidal 
additives, or surfactants. Liquid formulations may be aqueous-based or non-aqueous and 
employed as foams, suspensions, emulsifiable concentrates, or the like. The ingredients may 
include rheological agents, surfactants, emulsifiers, dispersants, or polymers. 

Alternatively, the novel proteins discussed and claimed herein may be prepared by native 
or recombinant bacterial expression systems in vitro and isolated for subsequent field 
application. Such protein may be either in crude cell lysates, suspensions, colloids, etc., or 
alternatively may be purified, refined, buffered, and/or further processed, before formulating in 
an active biocidal formulation. Likewise, under certain circumstances, it may be desirable to 
isolate crystals and/or spores from bacterial cultures expressing the crystal protein and apply 
solutions, suspensions, or colloidal preparations of such crystals and/or spores as the active 
bioinsect inhibitory composition. 

Regardless of the method of application, the active component(s) are applied 
at an insect inhibitory- or insecticidally-effective amount, which will vary depending on such 
factors as, for example, the specific coleopteran insects to be controlled, the specific 
plant or crop to be treated, the environmental conditions, and the method, rate, and quantity of 
application of the insect inhibitory-active composition. 

The insect inhibitory compositions described may be made by formulating either the 
bacterial cell, crystal and/or spore suspension, or isolated protein component with the desired 
agriculturally-acceptable carrier. The compositions may be formulated prior to administration in 
an appropriate means such as lyophilized, freeze-dried, desiccated, or in an aqueous carrier, 
medium or suitable diluent, such as saline or other buffer. The formulated compositions may be 
in the form of a dust or granular material, or a suspension in oil (vegetable or mineral), or water 
or oil/water emulsions, or as a wettable powder, or in combination with any other carrier material 
suitable for agricultural application. Suitable agricultural carriers can be solid or liquid and are 
well known in the art. The term "agriculturally-acceptable carrier" covers all adjuvants, e.g., 
inert components, dispersants, surfactants, tackifiers, binders, etc. that are ordinarily used in 
insecticide formulation technology; these are well known to those skilled in insecticide 
formulation. The formulations may be mixed with one or more solid or liquid adjuvants and 
prepared by various means, e.g., by homogeneously mixing, blending and/or grinding the insect 
inhibitory or insecticidal composition with suitable adjuvants using conventional formulation 
techniques. 

The insect inhibitory or insecticidal compositions of this invention are applied to the 
environment of the target coleopteran insect, typically onto the foliage of the plant or crop to be 
protected, by conventional methods, preferably by spraying. The strength and duration of insect 
inhibitory or insecticidal application will be set with regard to conditions specific to the 
particular pest(s), crop(s) to be treated and particular environmental conditions. The proportional 
ratio of active ingredient to carrier will naturally depend on the chemical nature, solubility, and 
stability of the insect inhibitory or insecticidal composition, as well as the particular formulation 
contemplated. 

Other application techniques, e.g., dusting, sprinkling, soaking, soil injection, seed 
coating, seedling coating, spraying, aerating, misting, atomizing, and the like, are also feasible 
and may be required under certain circumstances, e.g., for insects that cause root or stalk 
infestation, or for application to delicate vegetation or ornamental plants. These application 
procedures are also well-known to those of skill in the art. 

The insect inhibitory or insecticidal composition of the invention may be employed in the 
method of the invention singly or in combination with other compounds, including and not 
limited to other pesticides. The method of the invention may also be used in conjunction with 
other treatments such as surfactants, detergents, polymers or time-release formulations. The 
insect inhibitory or insecticidal compositions of the present invention may be formulated for 
either systemic or topical use. 

The concentration of insect inhibitory or insecticidal composition which is used for 
environmental, systemic, or foliar application will vary widely depending upon the nature of the 
particular formulation, means of application, environmental conditions, and degree of biocidal 
activity. Typically, the bioinsect inhibitory or insecticidal composition will be present in the 
applied formulation at a concentration of at least about 1% by weight and may be up to and 
including about 99% by weight. Dry formulations of the compositions may be from about 1% to 
about 99% or more by weight of the composition, while liquid formulations may generally 
comprise from about 1% to about 99% or more of the active ingredient by weight. Formulations 
which comprise intact bacterial cells will generally contain from about 10^4 to about 10^7 cells/mg. 

The insect inhibitory or insecticidal formulation may be administered to a particular plant 
or target area in one or more applications as needed, with a typical field application rate per 
hectare ranging on the order of from about 50 g to about 500 g of active ingredient, or of from 
about 500 g to about 1000 g, or of from about 1000 g to about 5000 g or more of active 
ingredient. 
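
As a worked illustration of the arithmetic implied by the concentration and application-rate ranges above, the following minimal sketch converts a target rate of active ingredient per hectare into the mass of formulated product required. The function name and the example figures (a formulation containing 10% active ingredient by weight, applied at 500 g of active ingredient per hectare) are assumptions chosen for illustration only and are not taken from this disclosure.

```python
def formulation_mass_per_hectare(active_g_per_ha: float,
                                 active_weight_percent: float) -> float:
    """Return grams of formulated product needed per hectare.

    active_g_per_ha: target rate of active ingredient (g/ha).
    active_weight_percent: active ingredient content of the formulation
        in percent by weight (hypothetical example value below).
    """
    if not 0 < active_weight_percent <= 100:
        raise ValueError("weight percent must be greater than 0 and at most 100")
    # mass of formulation = mass of active ingredient / weight fraction
    return active_g_per_ha * 100.0 / active_weight_percent


# Hypothetical example: 500 g active ingredient/ha from a 10% w/w formulation.
mass = formulation_mass_per_hectare(500.0, 10.0)
print(f"{mass:.0f} g of formulation per hectare")
```

The same relation may be applied at either end of the stated ranges; a more dilute formulation simply requires proportionally more formulated material per hectare.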

BRIEF DESCRIPTION OF THE DRAWINGS 

The drawings form part of the present specification and are included to further 
demonstrate certain aspects of the present invention. The invention may be better understood by 
reference to one or more of these drawings in combination with the detailed description of 
specific embodiments presented herein. 

Figure 1 is a schematic representation of a CryET33/CryET34 fusion protein or a 
tIC100/tIC101 fusion protein linked together in frame by a coding sequence flanked by 
BamHI and NheI restriction sites and encoding a peptide sequence comprising 
Gly-Ser-Gly-Gly-Ala-Ser. 

Figure 2 is a schematic representation of a CryET34/CryET33 or a tIC101/tIC100 fusion 
protein linked together in frame by a coding sequence flanked by BamHI and NheI 
restriction sites and encoding a peptide sequence comprising Gly-Ser-Gly-Gly-Ala-Ser. 

Figure 3 illustrates the results of a boll-weevil diet-overlay bioassay using a lepidopteran diet 
containing 0.1% stigmastanol for particular CryET33/CryET34 (sIC200 and sIC2001) 
and tIC100/tIC101 (sIC2006, sIC2007, and sIC2008) fusions. 

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 
Some Advantages of the Invention 

CryET33 and CryET34 in combination, and tIC100 and tIC101 in combination, are both 
two-component insecticidal protein systems, each derived from different Bacillus thuringiensis 
strains, and requiring both of the two proteins in an approximately equimolar ratio for 
bioactivity. Therefore, for either system to be effective, both proteins need to be present at the 
same time in order to confer protection to a plant against coleopteran insect infestation, 
and against boll weevil in particular. Each of the proteins is expressed from a different coding 
sequence in its respective strain of Bt; however, each set of proteins, i.e., CryET33 and 
CryET34 or CrytIC100 and CrytIC101, is expressed together in Bt from a polycistronic 
messenger RNA transcribed from a single DNA sequence in which both coding sequences are 
linked together in the genome. Therefore, the ability to express both proteins as a single 
construct in plants would eliminate several problems associated with attempting to express two 
separate proteins concurrently in a transgenic plant system. The major advantage of the fusion 
construct is that both proteins will be expressed simultaneously as they are under the control of a 
single promoter element. It is readily apparent to one skilled in the art that the simultaneous 
expression of two constructs in planta to achieve equimolar ratios of the proteins would be much 
more difficult than enabling the expression of one construct. A corollary to this benefit then, is 
that expression of both proteins in a single cassette would simplify subsequent breeding. In a 
subsequent breeding, the gene encoding both proteins would be transmitted to the progeny, or 
not at all, depending on whether the parent transmitting the gene was homozygous or 
heterozygous for the trait at the locus of the gene within the chromosome containing the gene. 
However, by expressing the proteins from a common cassette, the situation where only one gene 
of the pair is transmitted to subsequent generations, which can arise when the genes are present on 
different expression cassettes and distal from each other on the same chromosome or on different 
chromosomes, will not occur, thus reducing the complexity of the breeding of plants with the insect inhibitory 
protein expressed. Deletion of one gene of the pair by a crossover between elements in common 
within an expression cassette would render this inhibitory or insecticidal system of binary toxins 
derived from Bacillus thuringiensis ineffective. A fusion protein would be protected from such 
an occurrence, as both proteins would be expressed concurrently from within a single expression 
cassette. Expression as a fusion protein would also eliminate problems of gene silencing 
experienced with expression of two novel proteins under the control of similar promoter 
elements. 
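
The breeding argument above can be made concrete with a short Mendelian calculation. The sketch below is illustrative only and rests on stated assumptions not found in this disclosure: a parent plant hemizygous for each transgene is self-pollinated, the loci are unlinked, and segregation is strictly Mendelian, so that progeny carry at least one copy of a given locus with probability 3/4. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def selfed_progeny_classes(n_loci: int) -> dict:
    """Probability that selfed progeny carry a transgene at k of n loci.

    Assumes a parent hemizygous at each of n unlinked loci; a progeny
    plant then carries at least one copy at a given locus with
    probability 3/4 (1 - 1/2 * 1/2).
    """
    carry = Fraction(3, 4)
    classes = {}
    # Enumerate every present/absent pattern across the loci.
    for pattern in product([True, False], repeat=n_loci):
        p = Fraction(1)
        for present in pattern:
            p *= carry if present else 1 - carry
        k = sum(pattern)
        classes[k] = classes.get(k, Fraction(0)) + p
    return classes


single = selfed_progeny_classes(1)  # one fusion cassette
pair = selfed_progeny_classes(2)    # two separate, unlinked cassettes
print("single cassette, both components present:", single[1])
print("two cassettes, both genes present:", pair[2])
print("two cassettes, only one gene present:", pair[1])
```

Under these assumptions the single cassette delivers both components to every transgene-positive progeny plant, whereas with two unlinked cassettes a substantial class of progeny inherits only one member of the pair and would therefore lack bioactivity.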

Definitions 

The following words and phrases have the meanings set forth below. 

Expression: The combination of intracellular processes, including transcription and translation 
undergone by a coding DNA molecule such as a structural gene to produce a polypeptide. 

Promoter: A recognition site on a DNA sequence or group of DNA sequences that provide an 
expression control element for a structural gene and to which RNA polymerase specifically binds 
and initiates RNA synthesis (transcription) of that gene. 

Regeneration: The process of growing a plant from a plant cell (e.g., plant protoplast or explant). 

Structural gene: A gene that is expressed to produce a polypeptide. 

Susceptible insect larva: an insect larva which, upon having orally ingested a sample of diet 
containing one or more of the proteins of the present invention, the diet being either artificially 
produced or obtained from a plant tissue artificially coated with or expressing one or more of the 
proteins of the present invention from a recombinant gene or genes, is growth inhibited as 
measured by failure to gain weight, molting cycle frequency inhibition, observed lethargic 
behaviour, reduction in frass production, or death in comparison to either 1) a larva which does 
not exhibit any of these indications when feeding upon the same diet provided to a susceptible 
larva, or 2) a larva which is feeding upon a control diet which does not contain the one or more 
proteins of the present invention. 

Transformation: A process of introducing an exogenous DNA sequence (e.g., a vector, a 
recombinant DNA molecule) into a cell or protoplast in which that exogenous DNA is 
incorporated into a chromosome or is capable of autonomous replication. 

Transformed cell: A cell whose genetic composition, either chromosomal DNA or other 
naturally occurring intracellular DNA, has been altered by the introduction of an exogenous 
DNA molecule into the genetic composition of that cell. 

Transgenic cell: Any cell derived or regenerated from a transformed cell or derived from a 
transgenic cell. Exemplary transgenic cells include plant calli derived from a transformed plant 
cell and particular cells such as leaf, root, stem, e.g., somatic cells, or reproductive (germ) cells 
obtained from a transgenic plant regenerated from a transformed cell. 

Transgenic plant: A plant or progeny thereof derived from a transformed plant cell or protoplast, 
wherein the plant DNA contains an introduced exogenous DNA molecule not originally present 
in a native, non-transgenic plant of the same strain. The terms "transgenic plant" and 
"transformed plant" have sometimes been used in the art as synonymous terms to define a plant 
containing an exogenous and artificially introduced DNA molecule within its own naturally 
occurring genetic composition. However, it is thought more scientifically correct to refer to a 
regenerated plant or callus obtained from a transformed plant cell or protoplast as being a 
transgenic plant, and that usage will be followed herein. 

Vector: A DNA molecule capable of replication in a host cell and/or to which another DNA 
sequence can be operatively linked so as to bring about replication of the attached sequence. A 
plasmid is an exemplary vector. 

Probes And Primers 

In another aspect, nucleic acid sequence information provided by the invention allows for 
the preparation of relatively short DNA (or RNA) sequences having the ability to specifically 
hybridize to nucleic acid sequences of the selected polynucleotides disclosed herein. In these 
aspects, nucleic acid probes of an appropriate length are prepared based on a consideration of the 
nucleic acid sequence encoding the selected crystal protein, e.g., a sequence such as that shown 
in SEQ ID NO:1 or SEQ ID NO:3. The ability of such nucleic acid probes to specifically 
hybridize to a crystal protein-encoding nucleic acid sequence lends to those probes particular 
utility in a variety of embodiments. Most importantly, the probes may be used in a variety of 
assays for detecting the presence of complementary sequences in a given sample suspected of 
containing probe-complementary sequences. 

In certain embodiments, it is advantageous to use oligonucleotide primers. The sequence 
of such primers is designed using a polynucleotide of the present invention for use in detecting, 
amplifying, modifying, or mutating a defined sequence of nucleic acid encoding a crystal protein 
from B. thuringiensis using thermal amplification technology. Sequences of related crystal 
protein genes from other species may also be amplified by thermal amplification technology 
using such primers. 

In accordance with the present invention, a preferred nucleic acid sequence employed for 
hybridization studies or assays includes sequences that are complementary to a stretch of at 
least about 14 to 30 nucleotides derived from a crystal protein-encoding sequence, such as that 
shown in SEQ ID NO:1 or SEQ ID NO:3. A size of at least 14 nucleotides in length helps to 
ensure that the fragment will be of sufficient length to form a duplex molecule that is both stable 
and selective. Molecules having complementary sequences over stretches greater than 14 bases 
in length are generally preferred, though, in order to increase stability and selectivity of the 
hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained 
through probe hybridization. One will generally prefer to design nucleic acid molecules having 
sequence-complementary stretches of 14 to 20 nucleotides, or even longer where desired. Such 
fragments may be readily prepared by, for example, directly synthesizing the fragment by 
chemical means, by application of nucleic acid reproduction technology, such as thermal 
amplification technology disclosed in U.S. Pat. Nos. 4,683,195, and 4,683,202, herein 
incorporated by reference, or by excising selected DNA fragments from recombinant plasmids 
containing appropriate inserts and suitable restriction sites. 
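
By way of illustration, the selection of a probe complementary to a chosen stretch of a target sequence, subject to the minimum length of 14 nucleotides discussed above, can be sketched as follows. The target string used here is an arbitrary placeholder and is not taken from SEQ ID NO:1 or SEQ ID NO:3; the function names and the default probe length of 20 are likewise assumptions made for this example.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_probe(target: str, start: int, length: int = 20,
                 min_length: int = 14) -> str:
    """Return a probe complementary to target[start:start + length].

    start is 0-based. Probes shorter than min_length are rejected,
    reflecting the 14-nucleotide minimum discussed in the text.
    """
    if length < min_length:
        raise ValueError(f"probe must be at least {min_length} nucleotides")
    window = target[start:start + length]
    if len(window) < length:
        raise ValueError("requested window extends past the end of the target")
    return reverse_complement(window)


# Arbitrary placeholder target, NOT a sequence from this disclosure.
target = "ATGGCTAGCAAGGATCCGTTACGGATTCAGCTAGGCATTA"
probe = design_probe(target, start=0, length=20)
print(probe)
```

A probe designed in this way hybridizes to the strand shown; the complementary strand would be detected by using the window itself rather than its reverse complement.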

Expression Vectors 

The present invention contemplates expression vectors comprising a polynucleotide of 
the present invention. Thus, in one embodiment an expression vector is an isolated and purified 
DNA molecule comprising a promoter operatively linked to a coding region that encodes a 
polypeptide of the present invention, which coding region is operatively linked to a 
transcription-terminating region, whereby the promoter drives the transcription of the coding region. 

As used herein, the term "operatively linked" means that a promoter is connected to a 
coding region in such a way that the transcription of that coding region is controlled and 
regulated by that promoter. Means for operatively linking a promoter to a coding region are well 
known in the art. 

In a preferred embodiment, the recombinant expression of DNAs encoding the crystal 
proteins of the present invention is performed in a Bacillus host cell. Preferred host cells include 
B. thuringiensis, B. megaterium, B. subtilis, and related bacilli, with B. thuringiensis host cells 
being highly preferred. Promoters that function in bacteria are well-known in the art. 
Exemplary and preferred promoters for the Bacillus crystal proteins include any of the known 
crystal protein gene promoters, including the cryET33 and cryET34 gene promoters. 
Alternatively, mutagenized or recombinant crystal protein-encoding gene promoters may be 
engineered by the hand of man and used to promote expression of the novel gene sequences 
disclosed herein. 

In an alternate embodiment, the recombinant expression of DNAs encoding the crystal 
proteins of the present invention is performed using a transformed Gram-negative bacterium 
such as an E. coli or Pseudomonas spp. host cell. Promoters which function in high-level 
expression of target polypeptides in E. coli and other Gram-negative host cells are also well- 
known in the art. 

Where an expression vector of the present invention is to be used to transform a plant, a 
promoter is selected that has the ability to drive expression in plants. Promoters that function in 
plants are also well known in the art. Useful in expressing the polypeptide in plants are 
promoters that are inducible, viral, synthetic, constitutive as described (Poszkowski et al., 1989; 
Odell et al., 1985), and temporally regulated, spatially regulated, and spatio-temporally regulated 
(Chau et al., 1989). 

A promoter is also selected for its ability to direct the transformed plant cell's or 
transgenic plant's transcriptional activity to the coding region. Structural genes can be driven by 
a variety of promoters in plant tissues. Promoters can be near-constitutive, such as the CaMV 
35S promoter, or tissue-specific or developmentally specific promoters affecting dicots or 
monocots. 

Where the promoter is a near-constitutive promoter such as CaMV 35S, increases in 
polypeptide expression are found in a variety of transformed plant tissues (e.g., callus, leaf, seed 
and root). Alternatively, the effects of transformation can be directed to specific plant tissues by 
using plant integrating vectors containing a tissue-specific promoter. 

An exemplary tissue-specific promoter is the lectin promoter, which is specific for seed 
tissue. The lectin protein in soybean seeds is encoded by a single gene (Le1) that is only 
expressed during seed maturation and accounts for about 2 to about 5% of total seed mRNA. 
The lectin gene and seed-specific promoter have been fully characterized and used to direct seed 
specific expression in transgenic tobacco plants (Vodkin et al., 1983; Lindstrom et al., 1990). 

An expression vector containing a coding region that encodes a polypeptide of interest is 
engineered to be under control of the lectin promoter and that vector is introduced into plants 
using, for example, a protoplast transformation method (Dhir et al., 1991). The expression of the 
polypeptide is directed specifically to the seeds of the transgenic plant. 

A transgenic plant of the present invention produced from a plant cell transformed with a 
tissue specific promoter can be crossed with a second transgenic plant developed from a plant 
cell transformed with a different tissue specific promoter to produce a hybrid transgenic plant 
that shows the effects of transformation in more than one specific tissue. 

Exemplary tissue-specific promoters are corn sucrose synthetase 1 (Yang et al., 1990), 
corn alcohol dehydrogenase 1 (Vogel et al., 1989), corn light harvesting complex (Simpson, 
1986), corn heat shock protein (Odell et al., 1985), pea small subunit RuBP carboxylase (Poulsen 
et al., 1986; Cashmore et al., 1983), Ti plasmid mannopine synthase (Langridge et al., 1989), Ti 
plasmid nopaline synthase (Langridge et al., 1989), petunia chalcone isomerase (Van Tunen et 
al., 1988), bean glycine rich protein 1 (Keller et al., 1989), CaMV 35S transcript (Odell et al., 
1985) and Potato patatin (Wenzler et al., 1989). Preferred promoters are the cauliflower mosaic 
virus (CaMV 35S) promoter and the S-E9 small subunit RuBP carboxylase promoter. 

The choice of which expression vector and ultimately to which promoter a polypeptide 
coding region is operatively linked depends directly on the functional properties desired, e.g., the 
location and timing of protein expression, and the host cell to be transformed. These are well 
known limitations inherent in the art of constructing recombinant DNA molecules. However, a 
vector useful in practicing the present invention is capable of directing the expression of the 
polypeptide coding region to which it is operatively linked. 

Typical vectors useful for expression of genes in higher plants are well known in the art 
and include vectors derived from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens 
described (Rogers et al., 1987). However, several other plant integrating vector systems are 
known to function in plants including pCaMVCN transfer control vector described (Fromm et 
al., 1985). Plasmid pCaMVCN (available from Pharmacia, Piscataway, NJ) includes the 
cauliflower mosaic virus CaMV 35S promoter. 

In preferred embodiments, the vector used to express the polypeptide includes a selection 
marker that is effective in a plant cell, preferably a drug resistance selection marker. One 
preferred drug resistance marker is the gene whose expression results in kanamycin resistance; 
i.e., the chimeric gene containing the nopaline synthase promoter, Tn5 neomycin 
phosphotransferase II (nptII) and nopaline synthase 3' non-translated region described (Rogers et 
al., 1988). 

RNA polymerase transcribes a coding DNA sequence through a site where 
polyadenylation occurs. Typically, DNA sequences located a few hundred base pairs 
downstream of the polyadenylation site serve to terminate transcription. Those DNA sequences 
are referred to herein as transcription-termination regions. Those regions are required for 
efficient polyadenylation of transcribed messenger RNA (mRNA). 

Means for preparing expression vectors are well known in the art. Expression 
(transformation) vectors used to transform plants and methods of making those vectors are 
described in U.S. Pat. Nos. 4,971,908, 4,940,835, 4,769,061 and 4,757,011, the disclosures of 
which are incorporated herein by reference. Those vectors can be modified to include a coding 
sequence in accordance with the present invention. 

A variety of methods has been developed to operatively link DNA to vectors via 
complementary cohesive termini or blunt ends. For instance, complementary homopolymer 
tracts can be added to the DNA sequence to be inserted and to the vector DNA. The vector and 
DNA sequence are then joined by hydrogen bonding between the complementary 
homopolymeric tails to form recombinant DNA molecules. 

A coding region that encodes a polypeptide which confers insect inhibitory activity to a 
cell transformed to express the polypeptide is preferably a sequence encoding a tIC100 and/or 
tIC101 polypeptide, or a CryET33/CryET34 fusion peptide, a CryET34/CryET33 fusion peptide, 
a tIC100/tIC101 fusion peptide, a tIC101/tIC100 fusion peptide, a CryET33/tIC101 fusion 
peptide, or a tIC100/CryET34 fusion peptide, each of these or combinations thereof being further 
defined as B. thuringiensis insecticidal crystal fusion proteins. For example, in preferred 
embodiments, such a coding region has the nucleic acid sequence of SEQ ID NO:11, SEQ ID 
NO:13, SEQ ID NO:15, or SEQ ID NO:17 to encode a CryET33/CryET34 fusion, or a functional 
equivalent of those sequences. Also, co-expression of coding sequences for either CryET33 or 
tIC100 along with either CryET34 or tIC101 is shown herein to confer insect inhibitory activity 
to a plant or host cell. 

Characteristics of the Novel Crystal Proteins 

The present invention provides novel polypeptides that define a whole or a portion of 
tIC100, tIC101, CryET33/CryET34 fusions, tIC100/tIC101 fusions, CryET33/tIC101 fusions, 
and tIC100/CryET34 fusions whereby the fusion proteins contain various linkers disclosed and 
claimed herein. Various calculated physical characteristics of tIC100, tIC101, 
CryET33/CryET34 fusions containing various linkers, and tIC100/tIC101 fusions containing 
various linkers are listed below. The calculated physical characteristics of tIC100/CryET34 and 
CryET33/tIC101 fusions are not listed; however, such characteristics could be easily derived 
using known methods by persons skilled in the art. 

tIC100 

tIC100 is a protein as set forth in SEQ ID NO:2 derived from a cryptic B. thuringiensis 
DNA sequence. The cryptic tIC100 coding sequence as set forth in SEQ ID NO:1 is a part of an 
operon containing the tIC101 coding sequence, and is adjacent to and upstream of the coding 
sequence for tIC101. The cryptic sequence upstream of tIC101 contains the complete coding 
sequence for tIC100 except that a single guanosine residue at position 84 of the native cryptic 
tIC100 coding sequence as set forth in SEQ ID NO:27 causes the tIC100 coding sequence to be 
out of frame. The frameshift was eliminated, as described in Example 6 herein, by removing the 
single guanosine residue at position 84 to create the novel tIC100 coding sequence as set forth in 
SEQ ID NO:1, encoding the tIC100 protein as set forth in the translation in SEQ ID NO:1 and in 
the peptide sequence as set forth in SEQ ID NO:2, and as shown herein below. 



Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1               5                   10                  15

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
            20                  25                  30

Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
        35                  40                  45

Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
    50                  55                  60

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr
65                  70                  75                  80

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser
                85                  90                  95

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val
            100                 105                 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys
        115                 120                 125

Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
    130                 135                 140

Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145                 150                 155                 160

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
                165                 170                 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr Val Ser Ile Thr
            180                 185                 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn
        195                 200                 205

Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
    210                 215                 220

Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225                 230                 235                 240

Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
                245                 250                 255

Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
            260                 265
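
The frameshift correction described for the tIC100 coding sequence, namely removal of the single guanosine residue at position 84, can be sketched as a simple sequence edit. The toy sequences below are hypothetical stand-ins constructed for this example and do not reproduce SEQ ID NO:1 or SEQ ID NO:27; the function name is likewise an assumption.

```python
def delete_base(seq: str, position: int, expected: str = "G") -> str:
    """Remove the nucleotide at a 1-based position.

    The base found at that position is checked against `expected`
    first, so that an off-by-one error is caught rather than silently
    deleting the wrong residue.
    """
    base = seq[position - 1]
    if base.upper() != expected:
        raise ValueError(f"expected {expected} at position {position}, found {base}")
    return seq[:position - 1] + seq[position:]


# Hypothetical in-frame toy reading frame: start codon, 29 codons, stop codon.
in_frame = "ATG" + "GCT" * 29 + "TAA"
# Toy "cryptic" version carrying one extra G at position 84.
cryptic = in_frame[:83] + "G" + in_frame[83:]

repaired = delete_base(cryptic, 84)
print(len(cryptic) % 3, len(repaired) % 3)
```

In the toy case the extra residue leaves the cryptic sequence one nucleotide out of frame, and deleting it restores a length that is a whole number of codons.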

The resulting protein is calculated to comprise the following composition, including the 
amino acid sequence residues, number of each amino acid residue, and mole percent of each 
combination of residues of a particular species as set forth in Table 2. 
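
The residue counts and mole percentages reported for tIC100 can be recomputed directly from the amino acid sequence shown above. The following is a minimal sketch, with the sequence transcribed into one-letter code; the function names are hypothetical, and mole percent is taken here simply as 100 times the residue count divided by the total number of residues.

```python
from collections import Counter

# tIC100 amino acid sequence (SEQ ID NO:2) transcribed to one-letter code.
TIC100 = (
    "MGIINIQDEINDYMKG" "MYGATSVKSTYDPSFK" "VFNESVTPQYDVIPTE"
    "PVNNHITTKVIDNPGT" "SEVTSTVTFTWTETDT" "VTSAVTKGYKVGGSVS"
    "SKATFKFAFVTSDVTV" "TVSAEYNYSTTETTTK" "TDTRTWTDSTTVKAPP"
    "RTNVEVAYIIQTGNYN" "VPVNVESDMTGTLFCR" "GYRDGALIAAAYVSIT"
    "DLADYNPNLGLTNEGN" "GVAHFKGEGYIEGAQG" "LRSYIQVTEYPVDDNG"
    "RHSIPKTYIIKGSLAP" "NVTLINDRKEGR"
)


def composition(seq: str) -> dict:
    """Map each residue to (count, mole percent rounded to 3 places)."""
    counts = Counter(seq)
    total = len(seq)
    return {aa: (n, round(100.0 * n / total, 3)) for aa, n in sorted(counts.items())}


def group_composition(seq: str, residues: str) -> tuple:
    """Count and mole percent for a combined class of residues, e.g. 'ST'."""
    n = sum(seq.count(r) for r in residues)
    return n, round(100.0 * n / len(seq), 3)


for aa, (n, pct) in composition(TIC100).items():
    print(f"{aa}  {n:3d}  {pct:7.3f}")
print("S + T", group_composition(TIC100, "ST"))
```

The combined classes in the composition table (for example S + T or I + L + M + V) are obtained in the same way by summing the counts of the member residues.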

Molecular weight = 29239. Residues = 268 
Isoelectric point = 4.79 

Table 2. Amino Acid Composition of tIC100 

Residue Type          Number of Residues     Mole Percent 
                      in tIC100 Protein 

A = Ala               15                      5.597 
B = Asx                0                      0.000 
C = Cys                1                      0.373 
D = Asp               16                      5.970 
E = Glu               14                      5.224 
F = Phe                8                      2.985 
G = Gly               21                      7.836 
H = His                3                      1.119 
I = Ile               17                      6.343 
K = Lys               14                      5.224 
L = Leu                8                      2.985 
M = Met                4                      1.493 
N = Asn               18                      6.716 
P = Pro               12                      4.478 
Q = Gln                5                      1.866 
R = Arg                8                      2.985 
S = Ser               19                      7.090 
T = Thr               40                     14.925 
V = Val               27                     10.075 
W = Trp                2                      0.746 
Y = Tyr               16                      5.970 
Z = Glx                0                      0.000 

A + G                 36                     13.433     Non-polar 
S + T                 59                     22.015     Polar 
D + E                 30                     11.194     Acidic 
D + E + N + Q         53                     19.776 
H + K + R             25                      9.328     Basic 
D + E + H + K + R     55                     20.522 
I + L + M + V         56                     20.896     Hydrophobic non-aromatic 
F + W + Y             26                      9.701     Aromatic 

tIC101 

The following amino acid sequence, numbered for convenience, represents an example of 
a CrytIC101 insecticidal protein. The amino acid sequence is represented at SEQ ID NO:4. One 
nucleotide sequence which encodes the tIC101 amino acid sequence is set forth at SEQ ID NO:3, 
which indicates the particular codons observed in the native Bt coding sequence. 

Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1               5                   10                  15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
            20                  25                  30

Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
        35                  40                  45

Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
    50                  55                  60

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65                  70                  75                  80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
                85                  90                  95

Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
            100                 105                 110

Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
        115                 120                 125

The resulting protein is calculated to comprise the following composition, including the 
amino acid sequence residues, number of each amino acid residue, and mole percent of each 
combination of residues of a particular species as set forth in Table 3. 
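
A calculated molecular weight of the kind reported for tIC101 can be approximated from the sequence by summing average residue masses and adding one water molecule for the free termini. The sketch below is illustrative only: the residue masses are standard average values supplied for this example rather than figures from this disclosure, the function name is hypothetical, and the method actually used to obtain the reported value is not stated. The result should nonetheless fall close to the molecular weight of 14159 given for the 126-residue protein.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "C": 103.1388, "D": 115.0886, "E": 129.1155,
    "F": 147.1766, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "K": 128.1741, "L": 113.1594, "M": 131.1926, "N": 114.1038,
    "P": 97.1167, "Q": 128.1307, "R": 156.1875, "S": 87.0782,
    "T": 101.1051, "V": 99.1326, "W": 186.2132, "Y": 163.1760,
}
WATER = 18.01528

# tIC101 amino acid sequence (SEQ ID NO:4) transcribed to one-letter code.
TIC101 = (
    "MTVYNVTFTIKFYNEG" "EWGGPEPYGKIYAYLQ" "NPDHNFEIWSQDNWGK"
    "DTPEKSSHTQTIKISS" "PTGGPINQMCFYGDVK" "EYDVGNADDVLAYPSQ"
    "KVCSTPGTTIRLNGDE" "KGSYIQIRYSLAPA"
)


def average_molecular_weight(seq: str) -> float:
    """Sum of average residue masses plus one water for the chain termini."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


mw = average_molecular_weight(TIC101)
print(f"{len(TIC101)} residues, approximately {mw:.0f} Da")
```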

Molecular weight = 14159. Residues = 126 
Isoelectric point = 4.70 

Table 3. Amino Acid Composition of tIC101 

Residue Type          Number of Residues     Mole Percent 
                      in tIC101 Protein 

A = Ala                5                      3.968 
B = Asx                0                      0.000 
C = Cys                2                      1.587 
D = Asp                8                      6.349 
E = Glu                7                      5.556 
F = Phe                4                      3.175 
G = Gly               12                      9.524 
H = His                2                      1.587 
I = Ile                9                      7.143 
K = Lys                8                      6.349 
L = Leu                4                      3.175 
M = Met                2                      1.587 
N = Asn                8                      6.349 
P = Pro                9                      7.143 
Q = Gln                6                      4.762 
R = Arg                2                      1.587 
S = Ser                9                      7.143 
T = Thr               10                      7.937 
V = Val                6                      4.762 
W = Trp                3                      2.381 
Y = Tyr               10                      7.937 
Z = Glx                0                      0.000 

A + G                 17                     13.492     Non-polar 
S + T                 19                     15.079     Polar 
D + E                 15                     11.905     Acidic 
D + E + N + Q         29                     23.016 
H + K + R             12                      9.524     Basic 
D + E + H + K + R     27                     21.429 
I + L + M + V         21                     16.667     Hydrophobic non-aromatic 
F + W + Y             17                     13.492     Aromatic 

tIC100/tIC101 fusion with BamHI/NheI (GSGGAS) linker 

The following amino acid sequence, numbered for convenience, represents an example of 
a CrytIC100/CrytIC101 insecticidal protein fusion between CrytIC100 and CrytIC101, 
CrytIC100 being positioned at the amino terminus of the fusion, and containing a 
Gly-Ser-Gly-Gly-Ala-Ser (GSGGAS) amino acid sequence linker between the two protein sequences. The 
underlined amino acids at residues numbered from position 269 through position 274 indicate the 
linker sequence in this novel insecticidal fusion protein. 

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
50 55 60

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr
65 70 75 80

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser
85 90 95

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys
115 120 125

Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
130 135 140

Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr Val Ser Ile Thr
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn
195 200 205

Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225 230 235 240

Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg Gly Ser Gly Gly
260 265 270

Ala Ser Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn
275 280 285

Glu Gly Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr
290 295 300

Leu Gln Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp
305 310 315 320

Gly Lys Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile
325 330 335

Ser Ser Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp
340 345 350

Val Lys Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro
355 360 365

Ser Gln Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly
370 375 380

Asp Glu Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
385 390 395 400

The resulting protein is calculated to comprise the following composition, including the 
amino acid sequence residues, number of each amino acid residue, and mole percent of each 
combination of residues of a particular species as set forth in Table 4. 
Molecular weight = 43796. Residues = 400
Isoelectric point = 4.75

Table 4. Amino Acid Composition of tIC100/tIC101 Fusion [BamHI/NheI (GSGGAS) Linker]

Residue Type        Number of Residues   Mole Percent
                    in tIC100/101 Protein

A = Ala              21             5.250
B = Asx               0             0.000
C = Cys               3             0.750
D = Asp              24             6.000
E = Glu              21             5.250
F = Phe              12             3.000
G = Gly              36             9.000
H = His               5             1.250
I = Ile              26             6.500
K = Lys              22             5.500
L = Leu              12             3.000
M = Met               6             1.500
N = Asn              26             6.500
P = Pro              21             5.250
Q = Gln              11             2.750
R = Arg              10             2.500
S = Ser              30             7.500
T = Thr              50            12.500
V = Val              33             8.250
W = Trp               5             1.250
Y = Tyr              26             6.500
Z = Glx               0             0.000
A + G                57            14.250    Non-polar
S + T                80            20.000    Polar
D + E                45            11.250    Acidic
D + E + N + Q        82            20.500
H + K + R            37             9.250    Basic
D + E + H + K + R    82            20.500
I + L + M + V        77            19.250    Hydrophobic non-aromatic
F + W + Y            43            10.750    Aromatic



An insecticidal fusion protein similar to the tIC100/tIC101 fusion described above and
in Table 4 was constructed, but the DNA sequence representing the open reading frame encoding
the tIC101 peptide was positioned at the 5' end of the cassette so that the tIC101 peptide would be
positioned at the amino terminal position of the fusion protein, while the DNA sequence
representing the open reading frame encoding the tIC100 peptide was positioned toward the 3'
end of the cassette so that the tIC100 peptide would be positioned at the carboxy terminal
position of the fusion protein. The two proteins were also linked in frame by a sequence
encoding a Gly-Ser-Gly-Gly-Ala-Ser (GSGGAS) linker peptide as described above. The
resulting fusion peptide, tIC101/tIC100, was identical in amino acid composition
to the tIC100/tIC101 fusion peptide described in Table 4, exhibiting a
molecular weight of 43796 Da, comprising 400 amino acid residues, and exhibiting a calculated
isoelectric point of 4.75. This fusion peptide was also shown to demonstrate an effective
coleopteran insect inhibitory bioactivity, in particular in cotton boll weevil bioassay. The amino
acid sequence of the tIC101/tIC100 fusion peptide linked in frame by a GSGGAS linker is
shown below, and the underlined residues at amino acid sequence positions 127-132 represent
the GSGGAS linker:



Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Ser
115 120 125

Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp
130 135 140

Tyr Met Lys Gly Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp
145 150 155 160

Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val
165 170 175

Ile Pro Thr Glu Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp
180 185 190

Asn Pro Gly Thr Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr
195 200 205

Glu Thr Asp Thr Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly
210 215 220

Gly Ser Val Ser Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser
225 230 235 240

Asp Val Thr Val Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu
245 250 255

Thr Thr Thr Lys Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val
260 265 270

Lys Ala Pro Pro Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr
275 280 285

Gly Asn Tyr Asn Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr
290 295 300

Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr
305 310 315 320

Val Ser Ile Thr Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr
325 330 335

Asn Glu Gly Asn Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu
340 345 350

Gly Ala Gln Gly Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val
355 360 365

Asp Asp Asn Gly Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly
370 375 380

Ser Leu Ala Pro Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
385 390 395 400
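
The linker positions quoted for the two GSGGAS fusions follow from simple arithmetic on the component lengths. The Python sketch below is a hedged illustration: `linker_span` is a hypothetical helper, the tIC101 length of 126 residues is stated in this document, and the tIC100 length of 268 residues is inferred here from the stated linker start at position 269.

```python
def linker_span(n_term_len, linker_len, c_term_len):
    """Return (first, last, total): 1-based inclusive linker positions
    and the total length of an N-terminal/linker/C-terminal fusion."""
    first = n_term_len + 1
    last = n_term_len + linker_len
    total = last + c_term_len
    return first, last, total


TIC100_LEN = 268  # inferred from the linker starting at residue 269
TIC101_LEN = 126  # stated for tIC101 in this document
GSGGAS_LEN = 6

# tIC100 at the amino terminus, tIC101 at the carboxy terminus
print(linker_span(TIC100_LEN, GSGGAS_LEN, TIC101_LEN))

# tIC101 at the amino terminus, tIC100 at the carboxy terminus
print(linker_span(TIC101_LEN, GSGGAS_LEN, TIC100_LEN))
```

Swapping the order of the two components moves the linker from positions 269-274 to positions 127-132 while leaving the total at 400 residues, which is consistent with the two fusions sharing one composition, molecular weight, and isoelectric point.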



tIC101/tIC100 fusion with Gly-Gly linker

The following amino acid sequence, numbered for convenience, represents an example of
a CrytIC101/CrytIC100 insecticidal protein fusion between CrytIC101 and CrytIC100,
CrytIC101 being positioned at the amino terminus of the fusion, and containing a Gly-Gly (GG)
dipeptide linker between the two protein sequences. The underlined amino acids at residues
numbered from position 127 through position 128 indicate the linker sequence in this novel
insecticidal fusion protein.



Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Gly
115 120 125

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
130 135 140

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
145 150 155 160

Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
165 170 175

Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
180 185 190

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr
195 200 205

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser
210 215 220

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val
225 230 235 240

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys
245 250 255

Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
260 265 270

Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
275 280 285

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
290 295 300

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr Val Ser Ile Thr
305 310 315 320

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn
325 330 335

Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
340 345 350

Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
355 360 365

Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
370 375 380

Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
385 390 395
The resulting protein is calculated to comprise the following composition, including the
amino acid sequence residues, number of each amino acid residue, and mole percent of each
combination of residues of a particular species as set forth in Table 5.
Molecular weight = 43494. Residues = 396
Isoelectric point = 4.75

Table 5. Amino Acid Composition of tIC100/101 Fusion [Gly-Gly Linker]

Residue Type        Number of Residues   Mole Percent
                    in tIC100/101 Protein

A = Ala              20             5.051
B = Asx               0             0.000
C = Cys               3             0.758
D = Asp              24             6.061
E = Glu              21             5.303
F = Phe              12             3.030
G = Gly              35             8.838
H = His               5             1.263
I = Ile              26             6.566
K = Lys              22             5.556
L = Leu              12             3.030
M = Met               6             1.515
N = Asn              26             6.566
P = Pro              21             5.303
Q = Gln              11             2.778
R = Arg              10             2.525
S = Ser              28             7.071
T = Thr              50            12.626
V = Val              33             8.333
W = Trp               5             1.263
Y = Tyr              26             6.566
Z = Glx               0             0.000
A + G                55            13.889    Non-polar
S + T                78            19.697    Polar
D + E                45            11.364    Acidic
D + E + N + Q        82            20.707
H + K + R            37             9.343    Basic
D + E + H + K + R    82            20.707
I + L + M + V        77            19.444    Hydrophobic non-aromatic
F + W + Y            43            10.859    Aromatic
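
Tables 4 and 5 describe fusions of the same two proteins that differ only in the linker, so their compositions should differ only by the residues gained when Gly-Gly is replaced with GSGGAS. The Python sketch below illustrates that bookkeeping with a few of the tabulated figures (396 and 400 total residues, 28 and 30 serines, 82 residues in the D + E + N + Q group). The helper names are hypothetical and the check is a sanity test, not part of the disclosed method.

```python
from collections import Counter


def linker_delta(old_linker, new_linker):
    """Residues gained (positive) or lost (negative) when one linker,
    given in one-letter code, replaces another."""
    delta = Counter(new_linker)
    delta.subtract(Counter(old_linker))
    return {aa: n for aa, n in delta.items() if n != 0}


def mole_percent(count, total):
    """Residue count as a percentage of all residues, to 3 decimals."""
    return round(100.0 * count / total, 3)


delta = linker_delta("GG", "GSGGAS")
print(delta)

# Totals and serine counts for the Gly-Gly fusion, then the GSGGAS fusion
gg_total, gg_ser = 396, 28
gs_total = gg_total + sum(delta.values())
gs_ser = gg_ser + delta["S"]
print(gs_total, gs_ser)

# The D + E + N + Q group holds 82 residues in both fusions, so only
# the denominator changes between the two tables.
print(mole_percent(82, gg_total), mole_percent(82, gs_total))
```

Because the linker adds no acidic, basic, or amide residues, the group counts stay fixed and only the percentages shift slightly with the total length.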
CryET33/CryET34 fusion with BamHI/NheI (GSGGAS) linker

The following amino acid sequence, numbered for convenience, represents an example of
a CryET33/CryET34 insecticidal protein fusion between CryET33 and CryET34 and containing
a Gly-Ser-Gly-Gly-Ala-Ser (GSGGAS) amino acid sequence linker between the two protein
sequences. The underlined amino acids at residues numbered from position 268 through position
273 indicate the linker sequence in this novel insecticidal fusion protein.

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Ile Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Gly Gly Ala
260 265 270

Ser Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu
275 280 285

Gly Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu
290 295 300

Thr Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly
305 310 315 320

Lys Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser
325 330 335

Ser Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val
340 345 350

Lys Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser
355 360 365

Gln Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp
370 375 380

Glu Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
385 390 395
385 390 395 

The resulting protein is calculated to comprise the following composition, including the
amino acid sequence residues, number of each amino acid residue, and mole percent of each
combination of residues of a particular species as set forth in Table 6.
Molecular weight = 43792. Residues = 399
Isoelectric point = 4.53

Table 6. Amino Acid Composition of CryET33/CryET34 Fusion [BamHI/NheI (GSGGAS) Linker]

Residue Type        Number of Residues   Mole Percent
                    in CryET33/34 Protein

A = Ala              20             5.013
B = Asx               0             0.000
C = Cys               4             1.003
D = Asp              23             5.764
E = Glu              22             5.514
F = Phe              15             3.759
G = Gly              32             8.020
H = His               4             1.003
I = Ile              25             6.266
K = Lys              22             5.514
L = Leu              16             4.010
M = Met               5             1.253
N = Asn              28             7.018
P = Pro              20             5.013
Q = Gln              11             2.757
R = Arg               7             1.754
S = Ser              33             8.271
T = Thr              52            13.033
V = Val              30             7.519
W = Trp               5             1.253
Y = Tyr              25             6.266
Z = Glx               0             0.000
A + G                52            13.033    Non-polar
S + T                85            21.303    Polar
D + E                45            11.278    Acidic
D + E + N + Q        84            21.053
H + K + R            33             8.271    Basic
D + E + H + K + R    78            19.549
I + L + M + V        76            19.048    Hydrophobic non-aromatic
F + W + Y            45            11.278    Aromatic
CryET33/CryET34 fusion with (GGGS)3 linker

The following amino acid sequence, numbered for convenience, represents an example of
a CryET33/CryET34 insecticidal protein fusion between CryET33 and CryET34 and containing
a Gly-Ser-Gly-Gly-Gly-Ser-Gly-Gly-Gly-Ser-Gly-Gly-Gly-Ser-Ala-Ser (GGGS)3 amino acid
sequence linker between the two protein sequences. The underlined amino acids at residues
numbered from position 268 through position 283 indicate the linker sequence in this novel
insecticidal fusion protein.

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Gly Gly Gly
260 265 270

Ser Gly Gly Gly Ser Gly Gly Gly Ser Ala Ser Met Thr Val Tyr Asn
275 280 285

Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly Glu Trp Gly Gly Pro
290 295 300

Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr Asn Pro Asp His Asp
305 310 315 320

Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys Ser Thr Pro Glu Arg
325 330 335

Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser Asp Thr Gly Ser Pro
340 345 350

Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu Tyr Asp Val Gly
355 360 365

Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln Lys Val Cys Ser Thr
370 375 380

Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu Lys Gly Ser Tyr Val
385 390 395 400

Thr Ile Lys Tyr Ser Leu Thr Pro Ala
405
The resulting protein is calculated to comprise the following composition, including the
amino acid sequence residues, number of each amino acid residue, and mole percent of each
combination of residues of a particular species as set forth in Table 7.
Molecular weight = 44453. Residues = 409
Isoelectric point = 4.53

Table 7. Amino Acid Composition of CryET33/CryET34 [(GGGS)3 Linker]

Residue Type        Number of Residues   Mole Percent
                    in CryET33/34 Protein

A = Ala              20             4.890
B = Asx               0             0.000
C = Cys               4             0.978
D = Asp              23             5.623
E = Glu              22             5.379
F = Phe              15             3.667
G = Gly              39             9.535
H = His               4             0.978
I = Ile              25             6.112
K = Lys              22             5.379
L = Leu              16             3.912
M = Met               5             1.222
N = Asn              28             6.846
P = Pro              20             4.890
Q = Gln              11             2.689
R = Arg               7             1.711
S = Ser              36             8.802
T = Thr              52            12.714
V = Val              30             7.335
W = Trp               5             1.222
Y = Tyr              25             6.112
Z = Glx               0             0.000
A + G                59            14.425    Non-polar
S + T                88            21.516    Polar
D + E                45            11.002    Acidic
D + E + N + Q        84            20.538
H + K + R            33             8.068    Basic
D + E + H + K + R    78            19.071
I + L + M + V        76            18.582    Hydrophobic non-aromatic
F + W + Y            45            11.002    Aromatic
CryET33/CryET34 fusion with lysine oxidase (PALLKEAPRAEEELPP) linker

The following amino acid sequence, numbered for convenience, represents an example of
a CryET33/CryET34 insecticidal protein fusion between CryET33 and CryET34 and containing
a lysine oxidase amino acid sequence linker between the two protein sequences. The underlined
amino acids at residues numbered from position 268 through position 287 indicate the lysine
oxidase linker sequence in this novel insecticidal fusion protein.

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Pro Ala Leu
260 265 270

Leu Lys Glu Ala Pro Arg Ala Glu Glu Glu Leu Pro Pro Ala Ser Met
275 280 285

Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly Glu
290 295 300

Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr Asn
305 310 315 320

Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys Ser
325 330 335

Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser Asp
340 345 350

Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu
355 360 365

Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln Lys
370 375 380

Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu Lys
385 390 395 400

Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
405 410
The resulting protein is calculated to comprise the following composition, including the
amino acid sequence residues, number of each amino acid residue, and mole percent of each
combination of residues of a particular species as set forth in Table 8.
Molecular weight = 45420. Residues = 413
Isoelectric point = 4.51

Table 8. Amino Acid Composition of CryET33/ET34 Fusion [lysine oxidase
(PALLKEAPRAEEELPP) linker]

Residue Type        Number of Residues   Mole Percent
                    in CryET33/34 Protein

A = Ala              23             5.569
B = Asx               0             0.000
C = Cys               4             0.969
D = Asp              23             5.569
E = Glu              26             6.295
F = Phe              15             3.632
G = Gly              30             7.264
H = His               4             0.969
I = Ile              25             6.053
K = Lys              23             5.569
L = Leu              19             4.600
M = Met               5             1.211
N = Asn              28             6.780
P = Pro              24             5.811
Q = Gln              11             2.663
R = Arg               8             1.937
S = Ser              33             7.990
T = Thr              52            12.591
V = Val              30             7.264
W = Trp               5             1.211
Y = Tyr              25             6.053
Z = Glx               0             0.000
A + G                53            12.833    Non-polar
S + T                85            20.581    Polar
D + E                49            11.864    Acidic
D + E + N + Q        88            21.308
H + K + R            35             8.475    Basic
D + E + H + K + R    84            20.339
I + L + M + V        79            19.128    Hydrophobic non-aromatic
F + W + Y            45            10.896    Aromatic
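
Each of the three CryET33/CryET34 linkers begins with Gly-Ser and ends with Ala-Ser, which is what in-frame translation of a BamHI site (GGATCC) and an NheI site (GCTAGC) would give. The Python sketch below illustrates this under stated assumptions: the reading frame is assumed to start at the first base of each site, the core sequences between the sites are taken from the linker descriptions above, and the helper names are hypothetical. No codons for the core residues are implied.

```python
# Minimal codon table covering only the two restriction sites used here.
CODONS = {"GGA": "G", "TCC": "S", "GCT": "A", "AGC": "S"}

BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
NHEI_SITE = "GCTAGC"   # NheI recognition sequence


def translate(dna):
    """Translate DNA read in frame from its first base (assumed frame)."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def flanked_linker(core):
    """Place a one-letter core peptide between the residues encoded by
    the BamHI and NheI sites."""
    return translate(BAMHI_SITE) + core + translate(NHEI_SITE)


linkers = {
    "GSGGAS": flanked_linker("GG"),
    "(GGGS)3": flanked_linker("GGGS" * 3),
    "lysine oxidase": flanked_linker("PALLKEAPRAEEELPP"),
}

CRYET33_LEN = 267  # inferred from each linker starting at residue 268
for name, seq in linkers.items():
    first = CRYET33_LEN + 1
    last = CRYET33_LEN + len(seq)
    print(name, seq, first, last)
```

With these assumptions the three linkers span 6, 16, and 20 residues, matching the stated positions 268-273, 268-283, and 268-287.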
Nomenclature of the Novel Proteins

The inventors have arbitrarily assigned the designations tIC100 and tIC101 to the novel
proteins, and tIC100 and tIC101 to the novel nucleic acid sequences encoding the respective
polypeptides. Formal assignment of gene and protein designations based on the revised
nomenclature of crystal protein endotoxins may be assigned by a committee on the nomenclature
of B. thuringiensis, formed to systematically classify B. thuringiensis crystal proteins. The
inventors contemplate that the official nomenclature assigned to these sequences will supersede
the arbitrarily assigned designations of the present invention.

Transformed Host Cells and Transgenic Plants 

Methods and compositions for transforming a bacterium, a yeast cell, a plant cell, or an 
entire plant with one or more expression vectors comprising a crystal protein-encoding gene 
sequence are further aspects of this disclosure. A transgenic bacterium, yeast cell, plant cell or 
plant derived from such a transformation process or the progeny and seeds from such a 
transgenic plant are also further embodiments of the invention. 

Means for transforming bacteria and yeast cells are well known in the art. Typically, 
means of transformation are similar to those well known means used to transform other bacteria 
or yeast such as E. coli or Saccharomyces cerevisiae. Methods for DNA transformation of plant 
cells include Agrobacterium-mediated plant transformation, protoplast transformation, gene 
transfer into pollen, injection into reproductive organs, injection into immature embryos and 
particle bombardment. Each of these methods has distinct advantages and disadvantages. Thus, 
one particular method of introducing genes into a particular plant strain may not necessarily be 
the most effective for another plant strain, but it is well known which methods are useful for a 
particular plant strain. 

There are many methods for introducing transforming DNA sequences into cells, but not 
all are suitable for delivering DNA to plant cells. Suitable methods are believed to include 
virtually any method by which DNA can be introduced into a cell, such as by Agrobacterium 
infection, direct delivery of DNA such as, for example, by PEG-mediated transformation of 
protoplasts (Omirulleh et al., 1993), by desiccation/inhibition-mediated DNA uptake, by 
electroporation, by agitation with silicon carbide fibers, by acceleration of DNA coated particles, 
etc. In certain embodiments, acceleration methods are preferred and include, for example,
microprojectile bombardment and the like.

Technology for introduction of DNA into cells is well-known to those of skill in the art. 
Four general methods for delivering a gene into cells have been described: (1) chemical methods 
(Graham and van der Eb, 1973; Zatloukal et al., 1992); (2) physical methods such as 
microinjection (Capecchi, 1980), electroporation (Wong and Neumann, 1982; Fromm et al., 
1985; U.S. Pat. No. 5,384,253) and the gene gun (Johnston and Tang, 1994; Fynan et al., 1993); 
(3) viral vectors (Clapp, 1993; Lu et al., 1993; Eglitis and Anderson, 1988a; 1988b); and (4) 
receptor-mediated mechanisms (Curiel et al., 1991; 1992; Wagner et al., 1992). 

Electroporation 

The application of brief, high-voltage electric pulses to a variety of animal and plant cells
leads to the formation of nanometer-sized pores in the plasma membrane. DNA is taken directly
into the cell cytoplasm either through these pores or as a consequence of the redistribution of
membrane components that accompanies closure of the pores. Electroporation can be extremely
efficient and can be used both for transient expression of cloned genes and for establishment of
cell lines that carry integrated copies of the gene of interest. Electroporation, in contrast to
calcium phosphate-mediated transfection and protoplast fusion, frequently gives rise to cell
lines that carry one, or at most a few, integrated copies of the foreign DNA.

The introduction of DNA by means of electroporation is well-known to those of skill in
the art. In this method, certain cell wall-degrading enzymes, such as pectin-degrading enzymes,
are employed to render the target recipient cells more susceptible to transformation by
electroporation than untreated cells. Alternatively, recipient cells are made more susceptible to
transformation by mechanical wounding. To effect transformation by electroporation one may
employ either friable tissues such as a suspension culture of cells, or embryogenic callus, or
alternatively, one may transform immature embryos or other organized tissues directly. One
would partially degrade the cell walls of the chosen cells by exposing them to pectin-degrading
enzymes (pectolyases) or by mechanically wounding them in a controlled manner. Such cells would
then be recipient to DNA transfer by electroporation, which may be carried out at this stage, and
transformed cells then identified by a suitable selection or screening protocol dependent on the
nature of the newly incorporated DNA.

Microprojectile Bombardment 

A further advantageous method for delivering transforming DNA sequences to plant cells 
is microprojectile bombardment. In this method, particles may be coated with nucleic acids and 
delivered into cells by a propelling force. Exemplary particles include those comprised of 
tungsten, gold, platinum, and the like. 

An advantage of microprojectile bombardment, in addition to it being an effective means
of reproducibly stably transforming monocots, is that neither the isolation of protoplasts (Cristou
et al., 1988) nor the susceptibility to Agrobacterium infection is required. An illustrative
embodiment of a method for delivering DNA into maize cells by acceleration is a Biolistics
Particle Delivery System, which can be used to propel particles coated with DNA or cells
through a screen, such as a stainless steel or Nytex screen, onto a filter surface covered with corn
cells cultured in suspension. The screen disperses the particles so that they are not delivered to
the recipient cells in large aggregates. It is believed that a screen intervening between the
projectile apparatus and the cells to be bombarded reduces the size of projectile
aggregates and may contribute to a higher frequency of transformation by reducing damage
inflicted on the recipient cells by projectiles that are too large.

For the bombardment, cells in suspension are preferably concentrated on filters or solid
culture medium. Alternatively, immature embryos or other target cells may be arranged on solid
culture medium. The cells to be bombarded are positioned at an appropriate distance below the
macroprojectile stopping plate. If desired, one or more screens are also positioned between the
acceleration device and the cells to be bombarded. Through the use of techniques set forth
herein one may obtain up to 1000 or more foci of cells transiently expressing a marker gene.
The number of cells in a focus which express the exogenous gene product 48 hours
post-bombardment often ranges from 1 to 10 and averages 1 to 3.

In bombardment transformation, one may optimize the prebombardment culturing 
conditions and the bombardment parameters to yield the maximum numbers of stable 
transformants. Both the physical and biological parameters for bombardment are important in 
this technology. Physical factors are those that involve manipulating the DNA/microprojectile
precipitate or those that affect the flight and velocity of either the macro- or microprojectiles. 
Biological factors include all steps involved in manipulation of cells before and immediately 
after bombardment, the osmotic adjustment of target cells to help alleviate the trauma associated 
with bombardment, and also the nature of the transforming DNA, such as linearized DNA or 
intact supercoiled plasmids. It is believed that pre-bombardment manipulations are especially 
important for successful transformation of immature embryos. 

Accordingly, it is contemplated that one may wish to adjust various of the bombardment 
parameters in small scale studies to fully optimize the conditions. One may particularly wish to 
adjust physical parameters such as gap distance, flight distance, tissue distance, and helium 
pressure. One may also minimize the trauma reduction factors (TRFs) by modifying conditions 
which influence the physiological state of the recipient cells and which may therefore influence 
transformation and integration efficiencies. For example, the osmotic state, tissue hydration and 
the subculture stage or cell cycle of the recipient cells may be adjusted for optimum 
transformation. The execution of other routine adjustments will be known to those of skill in the 
art in light of the present disclosure. 
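
A small-scale optimization of the kind described above can be laid out as a full factorial grid over the four named physical parameters. The sketch below is illustrative only: the disclosure names the parameters but gives no values, so the three levels per parameter are placeholders, and the function name full_factorial is hypothetical.

```python
from itertools import product

# Placeholder levels: the text names these parameters but states no values.
PARAMETERS = {
    "gap_distance": ["low", "mid", "high"],
    "flight_distance": ["low", "mid", "high"],
    "tissue_distance": ["low", "mid", "high"],
    "helium_pressure": ["low", "mid", "high"],
}


def full_factorial(parameters):
    """Return one dict per combination of parameter levels."""
    names = list(parameters)
    level_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


runs = full_factorial(PARAMETERS)
print(len(runs))   # number of bombardment conditions in the grid
print(runs[0])     # first condition in the grid
```

With three levels for each of four parameters the grid holds 3^4 = 81 conditions, which is one reason such studies are usually run at small scale or on a subset of the grid.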

Agrobacterium-Mediated Transfer 

Agrobacterium-mediated transfer is a widely applicable system for introducing genes into 
plant cells because the DNA can be introduced into whole plant tissues, thereby bypassing the 
need for regeneration of an intact plant from a protoplast. The use of Agrobacterium-mediated
plant integrating vectors to introduce DNA into plant cells is well known in the art. See, for 
example, the methods described (Fraley et al., 1985; Rogers et al., 1987). Further, the 
integration of the Ti-DNA is a relatively precise process resulting in few rearrangements. The 
region of DNA to be transferred is defined by the border sequences, and intervening DNA is 
usually inserted into the plant genome as described (Spielmann et al., 1986; Jorgensen et al., 
1987). 

Modern Agrobacterium transformation vectors are capable of replication in E. coli as
well as Agrobacterium, allowing for convenient manipulations as described (Klee et al., 1985).
Moreover, recent technological advances in vectors for Agrobacterium-mediated gene transfer
have improved the arrangement of genes and restriction sites in the vectors to facilitate
construction of vectors capable of expressing various polypeptide coding genes. The vectors 
described (Rogers et al., 1987), have convenient multi-linker regions flanked by a promoter and 
a polyadenylation site for direct expression of inserted polypeptide coding genes and are suitable 
for present purposes. In addition, Agrobacterium containing both armed and disarmed Ti genes 
can be used for the transformations. In those plant strains where Agrobacterium-mediated
transformation is efficient, it is the method of choice because of the facile and defined nature of 
the gene transfer. 

Agrobacterium-mediated transformation of leaf disks and other tissues such as cotyledons 
and hypocotyls appears to be limited to plants that Agrobacterium naturally infects. 
Agrobacterium-mediated transformation is most efficient in dicotyledonous plants. Few
monocots appear to be natural hosts for Agrobacterium, although transgenic plants have been
produced in asparagus using Agrobacterium vectors as described (Bytebier et al., 1987). 
Therefore, commercially important cereal grains such as rice, corn, and wheat must usually be 
transformed using alternative methods. However, as mentioned above, the transformation of 
asparagus using Agrobacterium can also be achieved (see, for example, Bytebier et al., 1987). 
Recently, Jinjiang et al. (US Patent Serial No. 6,037,522; 2000) disclosed a method for efficient 
Agrobacterium mediated transformation of monocots. 

A transgenic plant regenerated from Agrobacterium mediated transformation methods 
typically contains a single simple insert on one chromosome. Such transgenic plants can be 
referred to as being heterozygous for the added insert, and for coding sequences contained within 
the insert. However, inasmuch as use of the word "heterozygous" usually implies the presence 
of a complementary sequence at the same locus of the second chromosome of a pair of 
chromosomes, and there is no such sequence in a plant containing a single simple insert, it is 
believed that a more accurate name for such a plant is an independent segregant, because the 
added, exogenous single simple insert segregates independently during mitosis and meiosis. 

More preferred is a transgenic plant that is homozygous for the added structural coding 
sequence; i.e., a transgenic plant that contains two or more coding sequences artificially 
introduced using transgenic methods, for example by Agrobacterium mediated transformation, 
one coding sequence at the same locus on each chromosome of a chromosome pair. A
homozygous transgenic plant can be obtained by sexually mating (selfing) an independent
segregant transgenic plant that contains a single artificially introduced coding sequence, 
germinating some of the seed produced and analyzing the resulting plants produced for enhanced 
carboxylase activity relative to a control (native, non-transgenic) or an independent segregant 
transgenic plant. 

It is to be understood that two different transgenic plants can also be mated to produce 
offspring that contain two independently segregating added, exogenous coding sequences. 
Selfing of appropriate progeny can produce plants that are homozygous for both artificially 
introduced simple insert sequences that encode polypeptides of interest. Back-crossing to a
parental plant and out-crossing with a non-transgenic plant are also contemplated. 
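
The expected outcome of selfing an independent segregant can be worked out with a simple Punnett-square calculation. The sketch below assumes ordinary Mendelian segregation, no selection against any genotype, and (for the two-insert case) unlinked loci; the "+" and "-" symbols for insert present and insert absent, and the function name self_cross, are conventions chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def self_cross(genotype):
    """Expected offspring genotype frequencies from selfing a diploid parent.

    genotype is a two-character string, one character per chromosome of the
    pair, e.g. "+-" for a single insert on one chromosome only.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(genotype, repeat=2))
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


# Selfing a plant that carries a single simple insert ("+-").
single_insert = self_cross("+-")
for g, freq in sorted(single_insert.items()):
    print(g, freq)

# Two unlinked inserts: probability that a selfed progeny plant is
# homozygous for both is the product of the single-locus probabilities.
both_homozygous = single_insert["++"] * single_insert["++"]
print(both_homozygous)
```

Under these assumptions one quarter of the selfed progeny are homozygous for a single insert, and one sixteenth are homozygous for both of two unlinked inserts, which is why progeny must be screened after selfing.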

Other Transformation Methods 

Transformation of plant protoplasts can be achieved using methods based on calcium 
phosphate precipitation, polyethylene glycol treatment, electroporation, and combinations of 
these treatments (see, e.g., Potrykus et al., 1985; Lorz et al., 1985; Fromm et al., 1986; Uchimiya 
et al., 1986; Callis et al., 1987; Marcotte et al., 1988). 

Application of these systems to different plant strains depends upon the ability to regenerate that 
particular plant strain from protoplasts. Illustrative methods for the regeneration of cereals from 
protoplasts are described (Fujimura et al., 1985; Toriyama et al., 1986; Yamada et al., 1986; 
Abdullah et al., 1986).

To transform plant strains that cannot be successfully regenerated from protoplasts, other 
ways to introduce DNA into intact cells or tissues can be utilized. For example, regeneration of 
cereals from immature embryos or explants can be effected as described (Vasil, 1988). In 
addition, "particle gun" or high-velocity microprojectile technology can be utilized (Vasil,
1992).

Using that latter technology, DNA is carried through the cell wall and into the cytoplasm 
on the surface of small metal particles as described (Klein et al., 1987; Klein et al., 1988; 
McCabe et al., 1988). The metal particles penetrate through several layers of cells and thus 
allow the transformation of cells within tissue explants.

Methods for Producing Insect-Resistant Transgenic Plants

By transforming a suitable host cell, such as a plant cell, with a sequence encoding, for
example, a CryET33/CryET34 fusion peptide or a tIC100 and/or tIC101 peptide(s), the
expression of the encoded crystal fusion protein (i.e., a bacterial crystal protein or polypeptide
having coleopteran-inhibitory activity) can result in the formation of insect-resistant plants.

By way of example, one may utilize an expression vector containing a coding region for a 
B. thuringiensis crystal protein and an appropriate selectable marker to transform a suspension of 
embryonic plant cells, such as wheat or corn cells, using a method such as particle bombardment
(Maddock et al., 1991; Vasil et al., 1992) to deliver the DNA coated on microprojectiles into the 
recipient cells. Transgenic plants are then regenerated from transformed embryonic calli that 
express the insect inhibitory proteins. 

The formation of transgenic plants may also be accomplished using other methods of cell 
transformation which are known in the art such as Agrobacterium-mediated DNA transfer
(Fraley et al., 1983; Jinjiang et al., 2000). Alternatively, DNA can be introduced into plants by 
direct DNA transfer into pollen (Zhou et al., 1983; Hess, 1987; Luo et al., 1988), by injection of 
the DNA into reproductive organs of a plant (Pena et al., 1987), or by direct injection of DNA 
into the cells of immature embryos followed by the rehydration of desiccated embryos (Neuhaus 
et al., 1987; Benbrook et al., 1986). 

The regeneration, development, and cultivation of plants from single plant protoplast 
transformants or from various transformed explants is well known in the art (Weissbach and 
Weissbach, 1988). This regeneration and growth process typically includes the steps of selection 
of transformed cells, culturing those individualized cells through the usual stages of embryonic 
development through the rooted plantlet stage. Transgenic embryos and seeds are similarly 
regenerated. The resulting transgenic rooted shoots are thereafter planted in an appropriate plant 
growth medium such as soil. 

The development or regeneration of plants containing the foreign, exogenous gene that 
encodes a polypeptide of interest introduced by Agrobacterium from leaf explants can be 
achieved by methods well known in the art such as described (Horsch et al., 1985). In this 
procedure, transformants are cultured in the presence of a selection agent and in a medium that
induces the regeneration of shoots in the plant strain being transformed as described (Fraley et
al., 1983).

This procedure typically produces shoots within two to four months and those shoots are 
then transferred to an appropriate root-inducing medium containing the selective agent and an 
antibiotic to prevent bacterial growth. Shoots that rooted in the presence of the selective agent to 
form plantlets are then transplanted to soil or other media to allow the production of roots. These 
procedures vary depending upon the particular plant strain employed, such variations being well 
known in the art. 

Preferably, the regenerated plants are self-pollinated to provide homozygous transgenic 
plants, as discussed before. Otherwise, pollen obtained from the regenerated plants is crossed to 
seed-grown plants of agronomically important, preferably inbred lines. Conversely, pollen from 
plants of those important lines is used to pollinate regenerated plants. A transgenic plant of the 
present invention containing a desired polypeptide is cultivated using methods well known to 
one skilled in the art. 

A transgenic plant of this invention thus has an increased amount of a coding region (e.g., 
a cry gene) that encodes the Cry polypeptide of interest. A preferred transgenic plant is an 
independent segregant and can transmit that gene and its activity to its progeny. A more 
preferred transgenic plant is homozygous for that gene, and transmits that gene to all of its 
offspring on sexual mating. Seed from a transgenic plant may be grown in the field or 
greenhouse, and resulting sexually mature transgenic plants are self-pollinated to generate true 
breeding plants. The progeny from these plants become true breeding lines that are evaluated 
for, by way of example, increased insect inhibitory capacity against coleopteran insects, 
preferably in the field, under a range of environmental conditions. The inventors contemplate 
that the present invention will find particular utility in the creation of transgenic plants of 
commercial interest including various cotton, potato, soybean, canola, tomato, turf grasses, 
wheat, corn, rice, barley, oats, a variety of ornamental plants and vegetables, as well as a number
of nut- and fruit-bearing trees and plants.

Illustrative Embodiments

This application discloses novel insecticidal proteins isolatable from Bacillus 
thuringiensis strains of bacterium, and in particular, insecticidal proteins exhibiting Coleopteran 
insecticidal activity. For the purposes of this disclosure, the phrase "insect inhibitory" should be 
correlated with the word "insecticidal", and these words and phrases are meant to be used 
interchangeably herein throughout. A composition comprising one or more of the peptides 
disclosed herein is considered to be insecticidal, and the term insecticidal, and by analogy "insect 
inhibitory", is intended to be defined as a protein which, upon ingestion into the digestive system 
of a target insect, causes morbidity and mortality, in that the target insect, having consumed a 
quantity of the protein is discouraged from eating further, and preferably the target insect's 
growth is stunted or reduced, and more preferably the target insect is subjected to drying, 
desiccation, and death upon eating an amount of a substance containing the insecticidal protein 
in an amount sufficient to cause growth inhibition, feeding inhibition, rejection of a substance 
containing the protein as a food source, and preferably death. 

An exemplary insecticidal composition comprises a sample which contains, in 
approximately equimolar concentrations, both of the proteins herein defined as CrytIC100 and
CrytIC101, alternatively known as tIC100 and tIC101. These proteins have been identified as
being expressible from a nucleotide sequence obtained from Bacillus thuringiensis strain 
EG9328. In the course of identifying Bacillus thuringiensis strains which exhibit Coleopteran 
activity, sequences complementary to the binary toxin composition CryET33 and CryET34 were 
used as probes and primers for hybridizing to and/or amplifying sequences from B.t. strains
exhibiting Coleopteran insecticidal activity. As a result of this hybridization and thermal 
amplification analysis, several strains were identified as containing DNA sequences which 
contain sequences exhibiting substantial homology to cryET33 and cryET34 DNA sequences 
and which provided a template for the thermal amplification reaction which produced one or 
more bands separable upon agarose gel electrophoresis and ethidium bromide staining similar in 
size to the operon sequence encoding the CryET33 and CryET34 proteins. It was suspected that 
these bands all encoded the ET33 and ET34 proteins or homologs thereof. It was surprising that 
one particular clone isolated from this amplification analysis failed to produce any crystal 
morphology when transformed into an acrystalliferous strain of B.t. Furthermore, DNA
sequence analysis of this particular clone resulted in the identification of a sequence which may
have, in evolutionary terms, previously encoded at least two proteins similar but not identical to 
CryET33 and CryET34. This sequence, and the cryptic operon contained within the sequence, 
isolated from B.t. strain EG9328, is set forth herein in SEQ ID NO:27. While it is impossible to
predict whether throughout evolutionary time one or more bases were added to the sequence
to disrupt the coding sequence of CrytIC100, or whether one or more bases were
removed from the sequence to disrupt the coding sequence of CrytIC100, or even whether
a CrytIC100 protein was ever produced by a Bacillus thuringiensis in nature, the fact
remains that removing one of the cytosine residues from nucleotide position 84 through 88
within the cryptic sequence as set forth in SEQ ID NO:27 causes the reading frame from
nucleotide position 1 through nucleotide position 804 to shift such that a single open reading
frame is created which allows this "corrected" sequence to encode the peptide herein described
as tIC100. When expressed along with tIC101, or when tIC100 and tIC101 are present in a
sample in approximately equimolar ratios, the combination of the two proteins results in an 
insecticidal composition, in particular when provided in an orally acceptable diet to a 
Coleopteran target insect. In particular, the Coleopteran target insect most prevalently affected 
by the tIC100 and tIC101 composition is a boll weevil insect, which is prevalently found as a
pest among cotton crops in the new world, i.e., in North America, Mexico, Central and South 
America, and Australia. It was also found by the inventors herein that fusions between these two 
proteins exhibited insecticidal activity when tested against the boll weevil, and that it was 
irrelevant whether the protein fusion contained CrytIC100 or CrytIC101 at the amino terminus of
the fusion protein. It was also determined that it was irrelevant as to which proteolytically 
susceptible amino acid sequence linker was present and in frame between the two CrytIC 
proteins, so long as the linker sequence was capable of being cleaved when the fusion protein 
was ingested in an orally acceptable medium by the boll weevil. 
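
The effect of removing a single cytosine from a run of cytosines can be illustrated with a toy sequence. The sketch below does not use SEQ ID NO:27; the short sequence, the position of its five-cytosine run, and the helper names are all invented for illustration. It only shows the general principle that deleting one base from a homopolymer run shifts the downstream reading frame and can turn a prematurely terminated frame into a longer open reading frame.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(dna, index):
    """Return dna with the base at the 0-based index removed."""
    return dna[:index] + dna[index + 1:]


# Hypothetical "cryptic" toy sequence with a run of five cytosines
# at 0-based positions 6 through 10.
cryptic = "ATGGCA" + "CCCCC" + "GTGATAAAGGTTAA"
corrected = delete_base(cryptic, 6)

print(translate(cryptic))    # frame hits a stop codon soon after the run
print(translate(corrected))  # frame restored, longer peptide
```

Because the bases in the run are identical, deleting any one of the five cytosines yields the same corrected sequence, which is consistent with the text referring to the run as a whole rather than to one specific position.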

The orally acceptable or orally administrable insect diets into which the insecticidal
proteins of the present invention are to be incorporated are well known in the art as described
herein. Such a diet can be any composition which can be orally ingested by the target insect pest.
For example, when the proteins or fusions of the present invention are expressed within a host
cell such as a plant, fungal, or bacterial cell, the composition can take the form of a cell extract, a cell
suspension, a cell homogenate, a cell lysate, a cell supernatant, a cell filtrate, or a cell pellet. In
addition, the composition containing the insecticidal protein(s) of the present invention can be 
formulated into a powder, a dust, a pellet, a granule, a spray, an emulsion, a colloid, or a 
solution, any of which can be topically applied to a substrate which is or can become an orally 
ingestible, orally acceptable, or an orally administrable diet for a target insect pest. The 
formulation can be prepared in a number of ways well known in the art, including but not to be 
limited to desiccation, lyophilization, homogenization, extraction, filtration, centrifugation,
sedimentation, or concentration. In any such orally acceptable, orally administrable, or orally 
ingestible diet intended for consumption by a target insect pest, the protein of the present 
invention should at least be present in a concentration from about 0.001% of the total weight of 
the composition to about 99% of the weight of the composition. 

In view of the nature of the target pest shown herein to be susceptible to the compositions 
disclosed herein, it is intended that nucleotide sequences be synthesized for expression of the 
proteinaceous agents of the present invention in plant cells, and in particular in cotton plant cells. 
It is well known that Bacillus thuringiensis DNA sequences encoding insecticidal proteins are 
not preferred for expression of the proteins encoded thereby in plants. Instead, it has been 
demonstrated time and again that the preferred DNA sequences for expression in plants should 
be artificially synthesized in order to maximize the levels of expression of the insecticidal 
proteins in plants. Therefore, it has previously been demonstrated that multiple DNA sequences, 
because of the redundancy of the genetic code, can encode the same or a substantially identical 
protein encoded by the native DNA sequence, i.e. "native" intended to mean "derived as found in 
nature, or as found in the genome of Bacillus thuringiensis, or in this case, because the coding 
sequence derived from a plasmid naturally occurring within a particular strain of Bacillus 
thuringiensis". Therefore, the prior art teachings indicating which codons to use when preparing 
a particular nucleotide sequence for expression of a Bt toxin in plants have been extensively 
referred to and those disclosures, well known in the art, are intended to be within the scope of 
this invention. 
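
The redundancy of the genetic code referred to above can be shown with a short example. The sketch below uses the standard genetic code; the four-residue peptide and the two DNA sequences are toy examples chosen here, not sequences from this disclosure, and the codon choices are not a plant-preferred codon set.

```python
from collections import Counter

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
CODONS_PER_AMINO_ACID = Counter(CODON_TABLE.values())


def translate(dna):
    """Translate every complete codon in dna (no stop handling needed here)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def synonymous_count(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    count = 1
    for amino_acid in peptide:
        count *= CODONS_PER_AMINO_ACID[amino_acid]
    return count


# Two different toy DNA sequences encoding the same peptide.
version_a = "ATGAAATTAGTT"
version_b = "ATGAAGCTGGTG"

print(translate(version_a), translate(version_b))
print(synonymous_count("MKLV"))
```

Even this four-residue peptide has 48 possible coding sequences, so for a full-length protein the choice among synonymous sequences is very large, and codon selection for the intended host becomes a design decision rather than a constraint.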

EXAMPLES

The following examples are included to demonstrate preferred embodiments of the 
invention. It should be appreciated by those of skill in the art that the techniques disclosed in the 
examples which follow represent techniques discovered by the inventor to function well in the 
practice of the invention, and thus can be considered to constitute preferred modes for its 
practice. However, those of skill in the art should, in light of the present disclosure, appreciate 
that many changes can be made in the specific embodiments which are disclosed and still obtain 
a like or similar result without departing from the spirit and scope of the invention. 

Example 1 - Construction of CryET33/CryET34 Insect Inhibitory Fusion Protein 

This example illustrates the construction of a DNA sequence encoding a CryET33 and 
CryET34 insect inhibitory fusion protein. 

CryET33 and CryET34 peptides and nucleic acid sequences encoding these novel 
peptides have been disclosed previously, at least in U.S. Patent Serial No. 6,063,756. In order to 
determine whether a CryET33/CryET34 fusion can be expressed as a single protein and retain 
bioactivity against boll weevil, a CryET33/CryET34 fusion was constructed based on the wild- 
type Bacillus thuringiensis sequences encoding the CryET33 and CryET34 peptides. An 
expression construct in pMON47407, a Bacillus thuringiensis universal expression vector, was 
constructed in which the CryET33 coding sequence was downstream of and adjacent to a 
Bacillus thuringiensis sporulation specific promoter at the 5'-end of the construct, and the 
CryET34 coding sequence was positioned downstream of and adjacent to the CryET33 coding 
sequence at the 3'-end of the cassette, mimicking the natural orientation within the native B.t.
cryET33 and cryET34 operon. A BamHI/NheI linker sequence encoding the amino acid
sequence represented by Gly-Ser-Gly-Gly-Ala-Ser (GSGGAS) was introduced in frame between
the CryET33 and CryET34 coding sequences to allow for protein flexibility as well as to provide
a convenient restriction site sequence for introducing other linkers if necessary (see Fig. 1). The
sequence encoding the CryET33/CryET34 fusion was constructed using overlapping thermal
amplification mutagenesis, and incorporated an SpeI site at the 5'-end and an XhoI site at the
3'-end of the cassette coding sequence. The thermal amplification product was cloned into a
pPCR-Script™ vector, and the sequence of the fusion was verified by double-stranded sequencing. The
SpeI/XhoI fragment containing the CryET33/CryET34 fusion peptide coding sequence was
cloned into the SpeI/XhoI-digested universal B.t. expression shuttle vector pMON47407 indicated
above, creating plasmid pMON38644 for expression of the CryET33/CryET34 fusion protein in
B.t. strain EG10650, a B.t. strain deficient in the production of any
insecticidal crystal proteins. The ligation mixture from which pMON38644 was derived was 
transformed directly into the B.t. expression strain EG10650, and colonies suspected of 
containing the expected plasmid were chosen for further analysis after selection on appropriate 
media. 
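
The relationship between the BamHI/NheI linker and the GSGGAS peptide can be checked by translation. The BamHI (GGATCC) and NheI (GCTAGC) recognition sequences are standard; the two glycine codons placed between them below (GGT GGT) are an assumption made for this sketch, since the text gives only the encoded amino acids and not the linker's nucleotide sequence.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

BAMHI_SITE = "GGATCC"  # reads in frame as Gly-Ser
NHEI_SITE = "GCTAGC"   # reads in frame as Ala-Ser

# Hypothetical linker: the middle two codons are assumed, not from the source.
linker_dna = BAMHI_SITE + "GGTGGT" + NHEI_SITE


def translate(dna):
    """Translate every complete codon in dna."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


linker_peptide = translate(linker_dna)
in_frame = len(linker_dna) % 3 == 0  # a multiple of 3 keeps ORF2 in frame

print(linker_peptide, in_frame)
```

Read in frame, each six-base restriction site contributes exactly two residues, so a linker of this shape keeps the downstream coding sequence in frame while leaving both sites available for swapping in other linkers.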

One colony producing a protein of the expected size was selected for further analysis. 
Plasmid DNA from the transformant was isolated and characterized by restriction enzyme 
analysis. The EG10650 strain containing the plasmid designated as pMON38644 (strain 
sIC2000) formed crystal structures upon sporulation. Spores containing these crystal structures 
were pelleted, washed and subjected to reducing SDS-PAGE analysis, which revealed the 
presence of a protein of the expected size (43.8 kDa) which exhibited little if any signs of 
degradation. The CryET33/CryET34 fusion protein crystals were submitted to qualitative 
bioassay against boll weevil upon solubilization into 10 mM NaHCO3, pH 10.0. Both soluble
and insoluble fractions demonstrated bioactivity against boll weevil in a qualitative diet overlay 
bioassay. 

Example 2 - Construction of a CryET34/CryET33 Fusion in Orientation Opposite to the 
Native Operon with Insect Inhibitory Activity 

This example illustrates the construction of a DNA sequence encoding a CryET34 and 
CryET33 insect inhibitory fusion protein, and illustrates that the Coleopteran inhibitory activity of
a fusion protein between CryET33 and CryET34 is independent of the orientation of the two 
proteins within the fusion. 

A CryET34/CryET33 fusion protein coding sequence was constructed by synthesizing a 
nucleic acid sequence having the CryET34 coding sequence located at the 5'-end, and the
CryET33 sequence located at the 3'-end. A BamHI/NheI linker coding for GSGGAS was also
introduced between the two coding sequences. The sequence encoding the CryET34/CryET33
fusion protein was constructed as in Example 1 above (see Fig. 2). The thermal amplification
product sequence was cloned into a pPCR-Script vector as in Example 1, and the sequence was
verified by double-stranded sequencing. The SpeI/XhoI fragment containing the
CryET34/CryET33 fusion peptide coding sequence was cloned into the SpeI/XhoI-digested
universal B.t. expression vector pMON47407, resulting in the formation of plasmid pMON38646,
which is useful for expression of the CryET34/CryET33 fusion protein in the B.t. crystal-minus
strain EG10650. The pMON38646 ligation mixture was transformed directly into EG10650, and
colonies suspected of containing the expected plasmid were chosen for further analysis after 
selection on the appropriate media. One colony containing a plasmid exhibiting the appropriate 
characteristics was designated as strain sIC2001. 

Growth of strain sIC2001 containing pMON38646 (cryET34/cryET33 fusion) revealed
formation of crystal structures upon sporulation. Spores were pelleted, washed and subjected to 
reducing SDS-PAGE analysis, which revealed the presence of a protein of the expected size 
(43.8 kDa). 

Example 3 - Development of ELISA Assay for CryET33/CryET34 Fusion Proteins 

This example illustrates the development of an ELISA assay for use in detecting and 
measuring the amount of a CryET33 and CryET34 fusion protein in a sample. 

An enzyme-linked immuno-sorbent assay was developed to evaluate the expression of 
CryET33/CryET34 or CryET34/CryET33 fusion proteins in a sample or in an in planta sample. 
Polyclonal IgG, which had been raised against a combination of both CryET33 and CryET34 
proteins, was purified from rabbit serum using Protein A affinity chromatography, and was used 
as the capture or primary (1°) antibody (Ab). A secondary (2°) antibody capable of binding the 
1° antibody was conjugated to an alkaline phosphatase enzyme. A B.t.-expressed
CryET33/CryET34 fusion protein was used as standard reference material. A series of 96-well 
immunoassay plates were loaded using the CryET33/CryET34 fusion protein standard and 
different combinations of 1° and 2° Ab dilutions. A typical CryET33/CryET34 standard curve is 
illustrated in Figure 3. The appropriate dilutions were determined to be 1:500 for 1° Ab and 
1:200 for 2° Ab. The assay was tested qualitatively using tobacco plants expressing 
CryET33/CryET34 fusion protein and the results were confirmed by western blot. These 
tobacco plants were then analyzed quantitatively and the results were found to be reproducible
upon repeating the assay (Table 9). The assay has been used to evaluate expression in tobacco
leaf, cotton callus, cotton leaf and cotton square. 

Table 9. Reproducibility of the CryET33/CryET34 fusion protein ELISA.

Plant #    Construct    1-12-2000, ppm    1-19-2000, ppm
1705-1     51713        0.39              0.39
1705-2     51713        0.23              0.18
1705-3     51713        0.23              0.23
1705-4     51713        0.15              0.15
1705-5     51713        0.46              0.42
1740-1     51719        1.27              1.32
1740-2     51719        1.03              1.02
1740-3     51719        3.15              3.21
1740-4     51719        1.15              1.16
1740-5     51719        0.96              0.96
1740-6     51719        1.29              1.35
1740-7     51719        1.82              1.86
1740-8     51719        2.73              2.63
1740-9     51719        1.64              1.60
1740-10    51719        1.75              1.82
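
One way to put a number on the reproducibility shown in Table 9 is the relative difference between the two assay dates for each plant. The sketch below uses the Table 9 values as given; the choice of metric (absolute difference divided by the mean of the two readings) is an assumption made here, not a statistic reported in this disclosure.

```python
# (plant, ppm on 1-12-2000, ppm on 1-19-2000), taken from Table 9.
TABLE_9 = [
    ("1705-1", 0.39, 0.39), ("1705-2", 0.23, 0.18), ("1705-3", 0.23, 0.23),
    ("1705-4", 0.15, 0.15), ("1705-5", 0.46, 0.42), ("1740-1", 1.27, 1.32),
    ("1740-2", 1.03, 1.02), ("1740-3", 3.15, 3.21), ("1740-4", 1.15, 1.16),
    ("1740-5", 0.96, 0.96), ("1740-6", 1.29, 1.35), ("1740-7", 1.82, 1.86),
    ("1740-8", 2.73, 2.63), ("1740-9", 1.64, 1.60), ("1740-10", 1.75, 1.82),
]


def relative_difference_percent(first, second):
    """Absolute difference as a percentage of the mean of the two readings."""
    return abs(first - second) / ((first + second) / 2) * 100


differences = {
    plant: relative_difference_percent(first, second)
    for plant, first, second in TABLE_9
}
worst_plant = max(differences, key=differences.get)
identical = [plant for plant, diff in differences.items() if diff == 0]

for plant, diff in differences.items():
    print(f"{plant}: {diff:.1f}%")
print("largest difference:", worst_plant)
```

By this measure the largest run-to-run difference falls on a low-expressing plant (1705-2), where a change of 0.05 ppm is a large fraction of the reading; the higher-expressing 1740 plants all differ by only a few percent.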



Example 4 - Expression and Bioactivity of CryET33/CryET34 Fusion Protein in Cotton 
Callus Tissue 

In order to quickly evaluate the in planta performance of the CryET33/CryET34 and 
CryET34/CryET33 fusion proteins, several constructs were made and expressed in cotton callus. 
In order to address possible folding or stability problems in plants, several parameters were
varied. For example, two different linkers were incorporated between the BamHI and NheI
restriction sites:

1) (GGGS)3 linker to allow for flexibility at the junction point;
2) Lysine oxidase cleavage site linker which is known to be cleaved in plants. This
would allow the two proteins to fold correctly in case the covalent linkage between 
the C-terminus of one protein and the N-terminus of the other causes steric 
perturbance. 

A chloroplast targeting sequence was also used, as well as various promoters. The 
constructs submitted for Agrobacterium-mediated transformation of cotton callus tissue are listed
below in Table 10 (all constructs contained an NPTII selectable marker). 

Table 10. Plant Transformation Plasmids Containing Various CryET33 and CryET34
Translational Fusions

           Expression Cassette Description
pMON#      Promoter    ORF1-Linker        ORF2    Terminator
51713      AtEF1a      ET33 BamHI-NheI    ET34    E9
51719      e35S        ET33 BamHI-NheI    ET34    E9
51739      e35S        ET33 (GGGS)3       ET34    E9
51740      e35S        ET33 LO            ET34    E9
51758      AtEF1a      ET34 BamHI-NheI    ET33    E9



Transformed cotton callus tissue was lyophilized and subjected to western blotting. Blots 
were probed with anti-CryET33/CryET34 antibodies. The results demonstrate that 
CryET33/CryET34 fusion proteins, with either BamHI/NheI (pMON51713 and 51719),
(GGGS)3 (pMON51739) or lysine oxidase (pMON51740) linkers, are expressed in transformed
cotton callus as judged by Western blot, and produce the protein band of expected size (about 44
kDa). In this example, the best expressor was tissue transformed with plasmid pMON51719.
Very little degradation of the fusion protein to protein fragments corresponding in size to the
individual CryET33 (29 kDa) and CryET34 (14 kDa) proteins was observed, indicating the 
stability of the fusions in cotton callus tissue. A CryET34/CryET33 fusion, constructed in the 
double border plant transformation plasmid pMON51758, however, did not express any protein 
detectable by Western blot in cotton callus tissue. The reason for the failure of this construct to
express the fusion protein in planta was not readily identifiable. It is believed, however, because
a CryET34/CryET33 fusion produced insecticidal protein of the expected size when expressed from
a cassette introduced into EG10650, that successful expression of CryET34/CryET33 fusion
protein in cotton callus tissue could easily be achieved without undue experimentation.

The expression levels for CryET33/CryET34 fusion proteins in lyophilized cotton callus
tissue as determined by ELISA are summarized in Table 11.



Table 11. Expression levels of CryET33/CryET34 fusions in lyophilized cotton callus.

pMON number    Date of collection    Protein           ET33/34 fusion, mg/g tissue
51713          08/12/1999            ET33/34 fusion    7.17
51713          09/14/1999            ET33/34 fusion    7.66
51713          10/12/1999            ET33/34 fusion    7.95
51713          02/11/2000            ET33/34 fusion    5.34
51713          03/03/2000            ET33/34 fusion    5.02
51719          07/22/1999            ET33/34 fusion    14.01
51719          09/23/1999            ET33/34 fusion    15.46
51719          02/11/2000            ET33/34 fusion    13.14
51719          03/03/2000            ET33/34 fusion    14.53
51739          11/16/1999            ET33/34 fusion    18.68
51739          01/12/2000            ET33/34 fusion    14.40
51739          02/11/2000            ET33/34 fusion    6.15
51739          03/03/2000            ET33/34 fusion    5.62
51740          11/16/1999            ET33/34 fusion    7.31
51740          01/12/2000            ET33/34 fusion    6.51
51740          02/11/2000            ET33/34 fusion    2.64
51740          03/03/2000            ET33/34 fusion    2.38
51758          02/11/2000            ET34/33 fusion    0.00
51758          03/03/2000            ET34/33 fusion    0.00

As indicated from the data in Table 11, the highest expression of a CryET33/CryET34
fusion protein was consistently achieved when using pMON51719. This result is consistent with 
western blotting data. 
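The comparison across constructs can be made explicit by grouping the Table 11 values by pMON number. The following is a minimal sketch, not part of the original analysis; the function name `summarize` and the variable names are illustrative assumptions, and the values are simply transcribed from Table 11 in the units reported there.

```python
from collections import defaultdict
from statistics import mean

# ELISA values transcribed from Table 11 (units as reported in the table).
TABLE_11 = [
    ("51713", 7.17), ("51713", 7.66), ("51713", 7.95), ("51713", 5.34), ("51713", 5.02),
    ("51719", 14.01), ("51719", 15.46), ("51719", 13.14), ("51719", 14.53),
    ("51739", 18.68), ("51739", 14.40), ("51739", 6.15), ("51739", 5.62),
    ("51740", 7.31), ("51740", 6.51), ("51740", 2.64), ("51740", 2.38),
    ("51758", 0.00), ("51758", 0.00),
]


def summarize(rows):
    """Return {construct: (mean, minimum, maximum)} of the ELISA values."""
    grouped = defaultdict(list)
    for construct, value in rows:
        grouped[construct].append(value)
    return {c: (mean(v), min(v), max(v)) for c, v in grouped.items()}


summary = summarize(TABLE_11)
# Construct with the highest mean expression across collection dates.
best = max(summary, key=lambda c: summary[c][0])

for construct, (avg, low, high) in sorted(summary.items()):
    print(f"pMON{construct}: mean {avg:.2f}, range {low:.2f}-{high:.2f}")
print("highest mean expression: pMON" + best)
```

Grouped this way, pMON51719 has the highest mean and the narrowest spread among the expressing constructs, while pMON51739 gives the single highest reading (18.68) but declines in the later collections.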

In order to determine the bioactivity of the lyophilized callus tissues, the transformed 
callus tissues were tested in a boll weevil diet-overlay bioassay. The results of three independent 
bioassays, and the expression levels for the lyophilized cotton callus tissues expressing 
CryET33/CryET34 fusion protein, are shown in Tables 12-14. As indicated from the data in 
Tables 12-14, callus tissue transformed with plasmid pMON51739 or plasmid pMON51719 
consistently demonstrated significant boll weevil activity. In addition, the results shown in 
Tables 12-14 demonstrate that the transformed tissues exhibiting the greatest boll weevil activity 
correlated well with elevated expression levels as measured by ELISA, so that expression levels 
of the fusion proteins could be used to screen for transformation events exhibiting commercial 
levels of fusion protein expression and coleopteran insect inhibitory bioactivity. 
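The stated relationship between ELISA expression level and boll weevil mortality can be checked with a simple correlation coefficient. The sketch below is illustrative only and is not part of the original analysis: it computes a Pearson coefficient over the nine rows of Table 12 (including the 39778 control), and the function name `pearson_r` is an assumption introduced here.

```python
from math import sqrt

# (ELISA ppm, % mortality) pairs transcribed from Table 12, control row first.
TABLE_12 = [
    (0.0, 0.00),     # 39778 (negative or non-transformed control)
    (7.17, 0.00),    # 51713-08/12/99
    (7.66, 6.25),    # 51713-09/14/99
    (7.95, 6.67),    # 51713-10/12/99
    (14.01, 16.67),  # 51719-01/12/00
    (18.68, 35.29),  # 51739-11/16/99
    (14.40, 25.00),  # 51739-01/12/00
    (7.31, 5.88),    # 51740-11/16/99
    (6.51, 6.25),    # 51740-01/12/00
]


def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


r = pearson_r(TABLE_12)
print(f"Pearson r (ELISA ppm vs. % mortality, Table 12): {r:.2f}")
```

For the Table 12 values the coefficient is above 0.9, in line with the statement that ELISA values could serve as a screen for bioactive events. With only nine data points from a single bioassay, the figure should be read as indicative rather than as a formal statistical result.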

Table 12.

Boll Weevil Bioactivity of Lyophilized Cotton Callus Tissues
Transformed to Express CryET33/CryET34 Fusion Protein

pMON-date of collection   % Mortality   ELISA, ppm
39778*                     0.00          0
51713-08/12/99             0.00          7.17
51713-09/14/99             6.25          7.66
51713-10/12/99             6.67          7.95
51719-01/12/00            16.67         14.01
51739-11/16/99            35.29         18.68
51739-01/12/00            25.00         14.4
51740-11/16/99             5.88          7.31
51740-01/12/00             6.25          6.51

* negative or non-transformed control

Table 13.

Boll Weevil Bioactivity of Lyophilized Callus Tissues Transformed to Express
CryET33/CryET34

pMON-date of collection   % Mortality   ELISA, ppm
39778*                     0.00          0.00
51713-02/11/00             0.00          5.34
51713-03/23/00             0.00          5.02
51719-02/11/00            31.25         13.14
51719-03/23/00            40.00         14.53
51739-02/11/00             6.67          6.15
51739-03/23/00             0.00          5.62
51740-02/11/00             0.00          2.64
51740-03/23/00             6.25          2.38
51758-02/11/00             0.00          0.00
51758-03/23/00             6.67          0.00

* negative control



Table 14.

Boll Weevil Bioactivity of Lyophilized Callus Tissues Transformed to Express
CryET33/CryET34

pMON-date of collection   % Mortality   ELISA, ppm
39778*                     0.00          0
51713-08/12/99             6.67          5.91
51713-09/14/99             7.14          7.01
51713-10/12/99             0.00          7.26
51713-03/03/00             0.00          5.02
51719-07/22/99            25.00         10.96
51719-09/23/99            26.67         11.72
51719-03/03/00            31.25         14.53
51739-11/16/99            28.57         19.47
51739-02/11/00             0.00          6.15
51739-03/03/00             0.00          5.62
51740-11/16/99            14.29          9.93
51740-01/12/00             6.25          4.84
51740-03/03/00             0.00          2.38
51758-03/03/00             0.00          0

* negative control

Example 5

Bioactivity of CryET33/CryET34 Fusion Protein Expressed in Cotton Plants

In order to evaluate expression and bioactivity of CryET33/CryET34 fusion protein in a
target plant, pMON51713 and pMON51719 were submitted for cotton transformation and plant
regeneration (all constructs contained an NPTII selectable marker).

The expression levels were determined for R0 plants by ELISA in fresh cotton leaf tissue,
and then in fresh cotton squares. Several plants were determined to express levels of
CryET33/CryET34 fusion protein above LC50 values for CryET33/CryET34 fusion protein (1-5
ppm). These results are presented in Table 15.



Table 15. Expression of CryET33/CryET34 fusion protein in fresh cotton tissue*

Plant                  ELISA value in      ELISA value in
(pMON-plant number)    leaf tissue, ppm    square tissue, ppm
51713-S011036           3.40                2.27
51719-S011132           8.22               19.59
51719-S011154           5.52                1.39
51719-S011207           4.93                1.53
51719-S011339           8.94               ND
51719-S011470           7.90               ND
51719-S011482           6.43               ND
51719-S011480           6.13               ND
51719-S011481           5.44               ND
51719-S011664           8.22               ND
51719-S011875          13.97               ND
51719-S012091           6.28               ND
51719-S012253           6.53               ND

* (cotton leaf and cotton square tissues were sampled)
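The comparison of Table 15 against the stated LC50 range (1-5 ppm) can be written out as a simple screen. This is a minimal sketch under the assumption that the upper bound of the range is used as the cut-off; the variable names are illustrative and not from the original disclosure, and only the leaf-tissue values are used because most square-tissue entries are listed as ND.

```python
# LC50 range for the CryET33/CryET34 fusion protein as stated in the text (ppm).
LC50_RANGE_PPM = (1.0, 5.0)

# Leaf-tissue ELISA values (ppm) transcribed from Table 15.
LEAF_PPM = {
    "51713-S011036": 3.40,
    "51719-S011132": 8.22,
    "51719-S011154": 5.52,
    "51719-S011207": 4.93,
    "51719-S011339": 8.94,
    "51719-S011470": 7.90,
    "51719-S011482": 6.43,
    "51719-S011480": 6.13,
    "51719-S011481": 5.44,
    "51719-S011664": 8.22,
    "51719-S011875": 13.97,
    "51719-S012091": 6.28,
    "51719-S012253": 6.53,
}

lower, upper = LC50_RANGE_PPM
# Plants whose leaf expression exceeds the upper end of the LC50 range.
above_upper = [plant for plant, ppm in LEAF_PPM.items() if ppm > upper]
# Plants whose leaf expression exceeds at least the lower end of the range.
above_lower = [plant for plant, ppm in LEAF_PPM.items() if ppm > lower]

print(f"{len(above_upper)} of {len(LEAF_PPM)} plants exceed {upper} ppm in leaf tissue")
print(f"{len(above_lower)} of {len(LEAF_PPM)} plants exceed {lower} ppm in leaf tissue")
```

With the stricter 5 ppm cut-off, all but two of the listed plants (51713-S011036 and 51719-S011207) pass the screen on leaf-tissue values.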



Bioactivity against boll weevil of cotton squares expressing CryET33/CryET34 fusion protein
was tested for several available plants using lyophilized tissue in a diet-overlay bioassay
(3% callus tissue in agar). The results for plant S011132 are presented in Table 16, which
demonstrates that plant S011132 (pMON51719, ET33/ET34 fusion with BamHI/NheI linker
driven by e35S promoter) exhibits commercial levels of activity against boll weevil. The results
further suggest the CryET33/CryET34 fusion proteins can be highly efficacious in cotton squares,
which are the primary targets of boll weevil infestation.



Table 16.

CryET33/CryET34 Fusion Protein Bioactivity Against Boll Weevil

Sample           Mortality, %   Stunting, %
C312              7.7            0
51719-S011132    60             80.7
ET33/ET34 PPM    25             78.9

• Lyophilized cotton square tissue from plant S011132 transformed with pMON51719 and demonstrated to
be expressing 15.8 mg CryET33/34 fusion per mg of lyophilized tissue.

• C312 - Coker 312 background control.

• Purified CryET33/CryET34 fusion protein (at 4 ppm) was mixed with Coker 312 lyophilized cotton
callus as a positive control.

Eleven six-week-old R1 plants selected after Agrobacterium-mediated transformation
with the plasmid pMON51719, i.e., containing an insecticidal fusion of CryET33 and CryET34
linked in frame by a GSGGAS linker, were transferred to a growth chamber in which
temperature and humidity conditions were precisely controlled. The plants exhibited a random
range of expression levels. Two plants were observed to express no detectable insecticidal
protein, and one plant expressed very low levels of the fusion protein. Four non-transgenic
Coker C312 plants were used as negative controls. All plants were infested with adult
boll weevils on a weekly basis for four weeks. Flaring squares from each plant were collected in
individual plastic containers, and dissected after a period of three weeks in order to enumerate
the number of larval and adult weevils. In all, each plant was sampled individually five times.
Leaf and square tissue samples were obtained at the outset of the experiment and fusion protein
levels were determined by ELISA.

The results demonstrated in vivo activity of an ET33/ET34 fusion protein containing a
GSGGAS linker against cotton boll weevils. The ELISA data collected from protein fractions
from leaf and square tissue samples from each plant tested correlated well with the observed
bioactivity of the plants exhibiting the highest ELISA values. Boll retention also correlated well
with the observed expression profiles, in that the plants exhibiting the greatest level of fusion
protein expression as judged by ELISA were the plants least susceptible to boll drop upon weevil
infestation. In this example, in order to mimic or exceed a field level high pressure infestation,
the plants were subjected to four independent infestations of adult weevils. This artificial
infestation level was much greater than the infestation that would typically be observed under
wild infestation conditions.

One undesirable consequence of the expression of this particular ET33/34 fusion in this
plant line was an aberrant plant phenotype. The cotton plants expressing the greatest levels of
the ET33/ET34 fusion exhibited an obvious uncharacteristic phenotype. Plants exhibiting a
lower level of expression had less severe symptoms; however, all plants derived from this
transformation event exhibited some level of the observed symptoms. The principal
morphological change observed in these plants was a swelling of the stems. In the most extreme
cases there was a shortening of the internode distance resulting in slightly shorter stature. There
did not seem to be any major impact on plant fertility. The observed phenotype could be specific
to this particular transformation event and is likely attributable to the site of insertion of the
cassette expressing the transgene.

Example 6 - Fusion of tIC100/tIC101 Insect Inhibitory Proteins

The binary insecticidal toxin identified herein and designated as open reading frames
producing the proteins tIC100 and tIC101, is derived from Bacillus thuringiensis strain EG9328.
The native Bacillus thuringiensis DNA sequence contained a frame-shift in the coding sequence
for the tIC100 protein. This frame-shift was altered by site-directed mutagenesis to produce the
coding sequence as set forth in SEQ ID NO:1, which resulted in the generation of an operon
which, when expressed in Bacillus thuringiensis strain EG10650 from plasmid pIC 10000 (strain
sIC 1000), encodes a Coleopteran-inhibitory product comprising two proteins - tIC100 (29 kDa)
and tIC101 (14 kDa).

Therefore, tIC100 is a protein derived from a cryptic B. thuringiensis DNA sequence.
The cryptic tIC100 coding sequence is a part of an operon containing the tIC101 coding
sequence, and is adjacent to and upstream of the coding sequence for tIC101. The cryptic
sequence upstream of tIC101 contains the complete coding sequence for tIC100 except that a
single guanosine residue at position 84 of the native cryptic tIC100 coding sequence as set forth
in SEQ ID NO:27 causes the tIC100 coding sequence to be out of frame. The frameshift was
eliminated by removing the single guanosine residue at position 84 to create the novel tIC100
coding sequence as set forth in SEQ ID NO:1.

Overlapping thermal amplification mutagenesis
was employed to repair the tIC100 reading frame. Four oligonucleotides were synthesized to
complete the reconstruction of a functional coding sequence for tIC100. Two reverse
complementary primers, SEQ ID NO:28 and SEQ ID NO:29, were synthesized which spanned
the target site sequence, i.e., the guanosine residue to be removed from the cryptic B.t. sequence.
Two additional primers were synthesized to take advantage of sequences downstream within the
cryptic tIC100 coding sequence and upstream of the proposed promoter sequence for the operon.
SEQ ID NO:30 is complementary to nucleotide positions 625-639 in tIC100 as shown in SEQ ID
NO:1, and was used with SEQ ID NO:28 in a thermal amplification reaction with the cryptic
tIC100 as a template to produce a first product which contains the corrected sequence from just
upstream of the frameshift correction point or target site sequence to just downstream of a unique
PstI site in the tIC100 coding sequence, located at nucleotide positions 247-252 of SEQ ID
NO:1. The other oligonucleotide primer, SEQ ID NO:29, was used along with SEQ ID NO:31 in
a thermal amplification reaction using the cryptic tIC100 sequence as a template to produce a
second product which also contains the corrected sequence at one end and an EcoRI restriction
site at the distal end of the product. The two amplification products were then mixed into a third
thermal amplification reaction along with primers corresponding to SEQ ID NO:30 and SEQ ID
NO:31, denatured and then allowed to anneal, a portion of the annealed products representing
one strand of the first product annealed at one end to the complementary end of one strand of the
other amplification product. The overlap/annealed sequence from both products represents the
reverse complementary sequences of SEQ ID NO:28 and SEQ ID NO:29. Elongation of those
sequences in the thermal amplification reaction produced a sequence which was then amplified
by the oligos represented by SEQ ID NO:30 and SEQ ID NO:31 to produce a third product,
which was purified, digested with PstI and EcoRI and inserted into the native cryptic sequence in
place of the native frame-shifted sequence to produce the novel functional sequence encoding the
tIC100 and tIC101 coleopteran inhibitory binary toxin peptides.
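The sequence-level effect of the repair, removing one extra base so that downstream codons fall back into frame, can be illustrated in a few lines. This is only a sketch under stated assumptions: the toy sequence is the first 24 nucleotides of SEQ ID NO:1 with a hypothetical extra g inserted at position 10 (the actual extra guanosine is at position 84 of SEQ ID NO:27, which is not reproduced here), and the helper names `remove_base` and `codons` are illustrative. It models the outcome of the correction, not the overlapping thermal amplification procedure itself.

```python
def remove_base(seq: str, position: int) -> str:
    """Delete the base at a 1-based position and return the shortened sequence."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside sequence")
    return seq[: position - 1] + seq[position:]


def codons(seq: str) -> list:
    """Split a sequence into complete codons, reading from the first base."""
    usable = len(seq) - len(seq) % 3
    return [seq[i : i + 3] for i in range(0, usable, 3)]


# First 24 nt of the repaired tIC100 coding sequence (SEQ ID NO:1).
toy_orf = "atgggaattatcaacattcaagac"

# Hypothetical frameshifted version: one extra g inserted at position 10.
toy_shifted = toy_orf[:9] + "g" + toy_orf[9:]

repaired = remove_base(toy_shifted, 10)

print("shifted :", " ".join(codons(toy_shifted)))
print("repaired:", " ".join(codons(repaired)))
```

In the shifted toy sequence the first three codons are unchanged and every codon after the insertion differs; deleting the single base restores the original codons.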

The amino acid sequence of the tIC100 and tIC101 binary peptide toxin is similar to the
amino acid sequence of the CryET33 and CryET34 binary peptide toxin. CryET33 is a
comparative counterpart to CrytIC100, and CryET34 is a comparative counterpart to CrytIC101.
The amino acid sequence of tIC100 was 74% identical to the amino acid sequence of CryET33,
and the amino acid sequence of tIC101 was about 82.5% identical to the amino acid sequence of
CryET34. It was postulated that tIC100 and tIC101 may share common structural and functional
properties with CryET33 and CryET34 because of the similarity between the amino acid
sequences of these proteins and that these proteins would have similar bioactivity. In fact, insect
inhibitory assays using tIC100 and tIC101 herein and completed as described in Examples 9 and
10 of U.S. Pat. No. 6,063,756 demonstrated insect inhibitory activity.

In view of the insect inhibitory activity exhibited by the binary toxin protein CrytIC100
and CrytIC101, and its similarities to the CryET33/CryET34 binary toxin protein, it was
further postulated that a fusion protein could be constructed in a manner similar to those
described in Examples 1 and 2 above. Several fusions were designed and constructed. Two of
these fusions were designed similarly to the CryET33/CryET34 fusions. That is, the tIC100 and
tIC101 proteins were fused in both orientations (i.e., tIC100-tIC101 and tIC101-tIC100) and
separated by a short hydrophilic linker (Gly-Ser-Gly-Gly-Ala-Ser). The nucleic acid sequence
encoding the linker is embraced by unique BamHI and NheI endonuclease restriction sites. Two
other fusions were designed with a short Gly-Gly linker since this configuration more closely
resembles the distance between the tIC100 and tIC101 sequences in the native B.t. operon.
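The four tIC100/tIC101 designs described above (two linkers, each in both orientations, as also listed in Table 17) can be enumerated mechanically. The sketch below is illustrative only; the function name `fusion_designs` and the label format are assumptions chosen to match the "Description of Fusion Expression Cassette" entries in Table 17.

```python
from itertools import permutations

# One-letter linker sequences and their three-letter spellings from the text.
LINKERS = {
    "GSGGAS": "Gly-Ser-Gly-Gly-Ala-Ser",
    "GG": "Gly-Gly",
}


def fusion_designs(first: str, second: str, linkers) -> list:
    """Return labels of the form N-terminal-linker-C-terminal for both orientations."""
    designs = []
    for linker in linkers:
        for n_term, c_term in permutations((first, second), 2):
            designs.append(f"{n_term}-{linker}-{c_term}")
    return designs


designs = fusion_designs("tIC100", "tIC101", LINKERS)
for label in designs:
    print(label)
```

The same enumeration applied to ET33 and ET34 with the GSGGAS linker gives the two orientations listed for sIC2000 and sIC2001.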

These nucleotide sequences encoding the tIC100/tIC101 fusions were made by
overlapping thermal amplification mutagenesis, cloned into the B.t. expression vector
pMON47407 and expressed in B.t. strain EG10650. Strain numbers have been assigned to these
expression strains as indicated in Table 17.

TABLE 17.

B.T. STRAINS CONTAINING PLASMIDS ENCODING ET33/ET34 AND TIC100/TIC101
FUSIONS

Strain number   pMON#   Description of Fusion Expression Cassette
sIC2000         38644   ET33-GSGGAS-ET34
sIC2001         38646   ET34-GSGGAS-ET33
sIC2002         38651   ET33-GSPALLKEAPRAEEELPPAS-ET34
sIC2003         38652   ET33-(GGGS)3-ET34
sIC2006         38653   tIC100-GSGGAS-tIC101
sIC2007         38654   tIC100-GG-tIC101
sIC2008         38655   tIC101-GG-tIC100
sIC2010         38657   tIC101-GSGGAS-tIC100

The tIC100/tIC101 fusions were expressed and identified within the spore-crystal
fraction of sporulated B.t. expression strains. SDS-PAGE analysis revealed the presence of a
band of the expected size (44 kDa), which is not present in the host strain (EG10650) alone.
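The expected band size follows from adding the two component masses and the linker. The following back-of-the-envelope sketch is not from the original disclosure: it uses the rounded 29 kDa and 14 kDa component sizes given in the text and assumes a rule-of-thumb average of about 0.11 kDa per linker residue, which overestimates small residues such as Gly, Ser and Ala. The function name `fusion_size_kda` is illustrative.

```python
# Rule-of-thumb average residue mass in kDa; an assumption, not a value from the source.
AVG_RESIDUE_KDA = 0.110


def fusion_size_kda(component_kda, linker: str) -> float:
    """Approximate fusion size as the sum of component masses plus linker residues."""
    return sum(component_kda) + len(linker) * AVG_RESIDUE_KDA


# tIC100 (29 kDa) and tIC101 (14 kDa) joined by the six-residue GSGGAS linker.
size = fusion_size_kda((29.0, 14.0), "GSGGAS")
print(f"estimated fusion size: {size:.2f} kDa")
```

Because the component sizes are themselves rounded, the estimate is only meant to show that a band at about 44 kDa is the size expected for an intact fusion rather than for either component alone.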

The spore-crystal fraction suspensions of tIC100/tIC101 fusions were quantitated using
spot densitometry and submitted for a diet-overlay bioassay against boll weevil in parallel with
CryET33/CryET34 fusions. These results are shown in Figure 3. Figure 3 demonstrates that the
tIC100/tIC101 fusions (sIC2006, sIC2007 and sIC2008) are approximately as active as the
CryET33/CryET34 fusions (sIC2000 and sIC2001).

All of the compositions and methods disclosed and claimed herein can be made and
executed without undue experimentation in light of the present disclosure. While the
compositions and methods of this invention have been described in terms of preferred
embodiments, it will be apparent to those of skill in the art that variations may be applied to the
compositions, methods and in the steps or in the sequence of steps of the methods described
herein without departing from the concept, spirit and scope of the invention. More specifically,
it will be apparent that certain agents which are both chemically and physiologically related may
be substituted for the agents described herein while the same or similar results would be
achieved. All such similar substitutes and modifications apparent to those skilled in the art are
deemed to be within the spirit, scope and concept of the invention as defined by the appended
claims. Accordingly, the exclusive rights sought to be patented are as described in the claims
below.

Claims:

1. An isolated insecticidal polypeptide selected from the group consisting of SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID
NO:26, and SEQ ID NO:33.

2. The polypeptide of claim 1 exhibiting insecticidal activity when provided in an orally 
acceptable insect diet to a susceptible Coleopteran insect or Coleopteran insect larva. 

3. The polypeptide of claim 2 exhibiting insecticidal activity when provided in an orally 
administrable diet to a susceptible Coleopteran insect or Coleopteran insect larva. 

4. The polypeptide of Claim 3 wherein said Coleopteran insect is a cotton boll weevil and 
said Coleopteran insect larva is a cotton boll weevil larva. 

5. A composition comprising an insecticidally effective amount of the polypeptide of claim 
1 wherein said composition is a bacterial cell comprising a polynucleotide sequence that encodes 
said polypeptide, said composition being selected from the group consisting of a cell extract, cell 
suspension, cell homogenate, cell lysate, cell supernatant, cell filtrate, or cell pellet. 

6. The composition of claim 5 wherein said bacterial cell is a bacterial species selected from 
the group consisting of Bacillus, Escherichia, Salmonella, Agrobacterium, and Pseudomonas. 

7. The composition of claim 6 wherein said bacterial cell is selected from the group 
consisting of sIOOOO, sIC2000, sIC2001, sIC2002, sIC2003, sIC2006, sIC2007, sIC2008, and 
sIC2010 bacterial cells. 

8. A composition comprising an insecticidally effective amount of the polypeptide of claim
1 wherein said composition is formulated as a powder, dust, pellet, granule, spray, emulsion,
colloid, or solution.

9. The composition according to claim 5, prepared by desiccation, lyophilization,
homogenization, extraction, filtration, centrifugation, sedimentation, or concentration.

10. The composition of claim 9 wherein said polypeptide is present in a concentration of
from about 0.001% to about 99% by weight.

11. An isolated polynucleotide sequence encoding an insecticidal polypeptide, wherein said
polynucleotide is selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID
NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, and SEQ ID NO:32,
and biologically functional equivalents thereof.

12. The polynucleotide sequence of Claim 11 wherein said polypeptide exhibits Coleopteran 
insecticidal activity when provided orally to a susceptible Coleopteran insect or Coleopteran 
insect larva. 

13. The polynucleotide sequence of Claim 12 wherein said polypeptide exhibits Coleopteran 
insecticidal activity when provided in an orally administrable diet or composition to a 
Coleopteran insect or Coleopteran insect larva. 

14. The polynucleotide sequence of Claim 13 wherein said Coleopteran insect is a cotton boll 
weevil and said Coleopteran insect larva is a cotton boll weevil larva. 

15. A polynucleotide sequence which is or is complementary to the polynucleotide sequence
of Claim 14 and which hybridizes under stringent conditions to a polynucleotide sequence
complementary to or encoding a polypeptide, said polypeptide being selected from the group
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ
ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and biologically functional equivalents
thereof.

16. A method for protecting a cotton plant from boll weevil infestation comprising 
providing to a boll weevil in its diet a plant transformed to express a protein toxic to said weevil 
wherein said protein is expressed in sufficient amounts in said plant's tissues to control boll 
weevil infestation of said plant and wherein said protein is selected from the group consisting of 
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, 
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID
NO:24, SEQ ID NO:26, and SEQ ID NO:33, and biologically functional equivalents thereof. 

17. A method for protecting a cotton plant from boll weevil infestation comprising providing 
to a boll weevil in its diet a plant or plant tissue transformed to express one or more proteins 
toxic to said weevil wherein said proteins are expressed in sufficient amounts alone or in 
combination to control boll weevil infestation and wherein said proteins are selected from the 
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ
ID NO:22, SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and biologically functional
equivalents thereof. 

18. A vector for use in transforming a host cell, wherein said vector comprises a 
polynucleotide sequence encoding an insecticidal polypeptide, said polypeptide selected from the 
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID 
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ 
ID NO:22, SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and biologically functional 
equivalents thereof.

19. The vector of claim 18, wherein said vector is selected from the group consisting of
plasmid pMON38644, plasmid pMON38646, plasmid pMON38651, plasmid pMON38652,
plasmid pMON38653, plasmid pMON38654, plasmid pMON38655, plasmid pMON38657,
plasmid pMON51713, plasmid pMON51719, plasmid pMON51739, plasmid pMON51740, and
plasmid pMON51758.

20. The vector of claim 18 wherein said host cell is selected from the group consisting of a 
plant cell and a bacterial cell. 

21. A plant tissue transformed with a polynucleotide sequence which expresses the 
polypeptide of Claim 1, wherein said tissue is selected from the group consisting of a plant cell, 
an embryonic plant tissue, plant calli, a leaf, a plant stem, a plant root, a plant flower, a fruit, a 
fruiting body, a boll, and a plant seed. 

22. The plant tissue of claim 21 wherein said tissue comprises said polypeptide present in a 
Coleopteran insect inhibitory effective amount. 

23. The plant tissue of claim 22 wherein said Coleopteran insect is a cotton boll weevil. 

24. A plant regenerated from the tissue of claim 21 wherein said plant is selected from the 
group of plants consisting of corn, wheat, cotton, soybean, oat, rice, rye, sorghum, sugarcane, 
tomato, tobacco, kapok, flax, potato, barley, turf grass, pasture grass, berry bush, fruit tree, 
legume, vegetable, ornamental plant, shrub, cactus, succulent, deciduous tree, and evergreen tree. 

25. A method of making a transgenic plant resistant to Coleopteran insect infestation 
comprising the steps of: 

a) incorporating into a genome of a plant cell a polynucleotide comprising a plant 
functional promoter sequence operably linked to a nucleotide sequence encoding a 
Coleopteran insecticidal polypeptide; 

b) isolating and propagating a plant cell transformed with said polynucleotide; 

c) regenerating a plant from said plant cell transformed with said polynucleotide; and 

d) propagating said plant;

wherein said plant expresses an insecticidally effective amount of said polypeptide from
said polynucleotide, and wherein said polypeptide is selected from the group consisting
of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID
NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID
NO:22, SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and biologically functional
equivalents thereof.

26. The method of claim 25 wherein said plant cell is either a monocot or a dicot plant cell. 

27. The method of claim 26 wherein said monocot plant cell is selected from the group of 
plant cells consisting of corn, wheat, rye, barley, rice, banana, sugarcane, oat, flax, turf grass,
pasture grass, and sorghum cells. 

28. The method of claim 26 wherein said dicot plant cell is selected from the group of plant 
cells consisting of cotton, soybean, canola, potato, tomato, fruit tree, shrub, vegetable, and berry 
cells. 

29. An isolated and purified antibody which specifically binds to a peptide selected from the 
group of peptides consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, 
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID 
NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and 
immunologically detectable variants thereof, or an epitope therein, said antibody produced from 
the immune system of a vertebrate animal in response to the exposure of all or an antigenic part 
of said peptide to the animal's immune system. 

30. A method for detecting the presence of a peptide in a sample comprising obtaining a
solution suspected of containing said peptide, probing said solution with the antibody of claim
29, and detecting the binding of said antibody to said peptide; wherein said peptide is selected
from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and immunologically
detectable variants thereof.

31. A kit for detecting the presence of the peptide in a sample comprising, in suitable 
container means, an antibody that binds to said peptide, reagents necessary for mixing the 
peptide and antibody in a solution, at least a first immunodetection reagent providing said 
antibody along with control antibody, control antigen, and the reagents and instructions 
necessary for detecting said binding; wherein said peptide is selected from the group consisting of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, 
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID 
NO:24, SEQ ID NO:26, and SEQ ID NO:33, and immunologically detectable variants thereof. 

32. A plant cell transformed with a polynucleotide sequence that expresses one or more of 
the polypeptides as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, 
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID 
NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, and SEQ ID NO:33, and insecticidal 
variants thereof, wherein said cell produces an amount of said one or more polypeptides effective 
for controlling a Coleopteran insect pest infestation. 

33. The plant cell of claim 32 wherein said Coleopteran insect pest is a cotton boll weevil and 
said plant cell is a cotton plant cell. 

34. A method of making a host cell resistant to Coleopteran insect pest infestation 
comprising the steps of: 

a) transforming said host cell with a polynucleotide sequence encoding a Coleopteran 
insect inhibitory peptide; and 

b) selecting a host cell expressing said inhibitory peptide; 

wherein said inhibitory peptide is selected from the group consisting of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, and
SEQ ID NO:33, and biologically functional equivalents thereof.

35. The method of claim 34, wherein said Coleopteran insect pest is a cotton boll weevil and 
said host cell is a cotton plant cell. 

36. An insecticidal composition comprising SEQ ID NO:2 and SEQ ID NO:4. 

37. An insecticidal composition according to claim 36 further comprising any one of the 
polypeptides selected from the group consisting of SEQ ID NO:12 5 SEQ ID NO: 14, SEQ ID 
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and SEQ ID NO:26, 
and biologically functional equivalents thereof. 
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SEQUENCE LISTING 

<110> Monsanto Company 
Gouzov, Victor 
5 Roberts, James 

Sivasupramaniam, Sakuntala 
Malvar, Thomas 

<12 0> Novel Coleopteran Active Insect Inhibitory Proteins and Methods of Use 
10 Therefor 

<130> 38-21(51382) - MOBT :227P 

^ <150> 60/232,099 
15 <151> 2000-09-12 

<160> 33 

<170> Patentln version 3.0 

20 

<210> 1 

<211> 804 

<212> DNA 

<213> Artificial 

<220> 

<221> CDS 

<222> (1) . . (804) 

<223> tIClOO coding sequence 

<400> 1 

atg gga att atc aac att caa gac gaa att aat gac tac atg aaa ggt 48
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15

atg tat ggt gca aca tct gtt aaa agc act tat gac ccc tca ttc aaa 96
Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

gta ttt aac gaa tct gtg aca cct caa tat gat gtg att cca aca gaa 144
Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
35 40 45

cct gta aat aat cat att act act aaa gta ata gat aat cca ggg act 192
Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
50 55 60

tca gaa gta acc agt aca gta acg ttc aca tgg acg gaa acc gac act 240
Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr
65 70 75 80

gta acc tct gca gtg act aaa ggg tat aaa gtc ggt ggt tca gta agc 288
Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser
85 90 95

tca aaa gca act ttt aaa ttt gct ttt gtt act tct gat gtt act gta 336
Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val
100 105 110

act gta tca gca gaa tat aat tat agt aca aca gaa aca aca aca aaa 384
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys
115 120 125

aca gat aca cgc aca tgg acg gat tcg acg aca gta aaa gcc cct cca 432
Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
130 135 140

aga act aat gta gaa gtt gca tat att atc caa act gga aat tat aac 480
Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

gtt ccg gtt aat gta gag tct gat atg act gga acg cta ttt tgc aga 528
Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
165 170 175

ggg tat aga gat ggt gca cta att gca gcg gct tat gtt tct ata aca 576
Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr Val Ser Ile Thr
180 185 190

gat tta gca gat tac aat cct aat ttg ggt ctt aca aat gaa ggg aat 624
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn
195 200 205

ggg gtt gct cat ttt aaa ggt gaa ggt tat ata gag ggt gcg caa ggc 672
Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
210 215 220

tta aga agc tac att caa gtt aca gaa tat cca gtg gat gat aat ggc 720
Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225 230 235 240

aga cat tcg ata cca aaa act tat ata att aaa ggt tca tta gca ccc 768
Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
245 250 255

aat gtt act tta ata aat gat aga aag gaa ggt aga 804
Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
260 265



<210> 2
<211> 268
<212> PRT
<213> Artificial

<400> 2

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
50 55 60

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr
65 70 75 80

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser
85 90 95

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys
115 120 125

Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
130 135 140

Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr Val Ser Ile Thr
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn
195 200 205

Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225 230 235 240

Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
260 265
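
The correspondence between the coding sequence of SEQ ID NO:1 and the protein of SEQ ID NO:2 can be checked mechanically. The following is a minimal Python sketch, not part of the listing itself: it translates the first 48 nucleotides with the standard genetic code and compares the result with the first 16 residues of the protein. The codon row is transcribed on the assumption that the fourth codon reads `atc`, and the names `translate`, `CODON_TABLE`, `ROW_1` and `EXPECTED` are illustrative only.

```python
from itertools import product

# Standard genetic code, built from the conventional t-c-a-g codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a lowercase DNA string codon by codon (whitespace ignored)."""
    seq = "".join(dna.split())
    if len(seq) % 3:
        raise ValueError("length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First row of SEQ ID NO:1 (nucleotides 1-48); 'atc' is an assumed reading.
ROW_1 = "atg gga att atc aac att caa gac gaa att aat gac tac atg aaa ggt"
# First 16 residues of SEQ ID NO:2 in one-letter code.
EXPECTED = "MGIINIQDEINDYMKG"

first_row_protein = translate(ROW_1)

# The declared lengths agree: 804 nucleotides correspond to 268 codons.
codon_count = 804 // 3
```

The same function can be applied row by row to the other coding sequences in the listing to flag codons whose transcription disagrees with the residue printed beneath them.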



<210> 3
<211> 378
<212> DNA
<213> Bacillus thuringiensis
<220>
<221> CDS
<222> (1)..(378)
<223> tIC101

<400> 3

atg aca gta tat aac gta act ttt acc att aaa ttc tat aat gaa ggt 48
Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

gaa tgg ggg ggg cca gaa cct tac ggt aag ata tat gca tac ctt caa 96
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

aat cca gat cat aat ttc gaa att tgg tca caa gat aat tgg ggg aag 144
Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

gat acg cct gag aaa agt tct cac act caa aca att aaa ata agt agc 192
Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

cca aca ggg ggg cct ata aac caa atg tgt ttt tat ggt gat gta aaa 240
Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

gaa tac gac gta gga aat gca gat gat gtt ctc gcc tat cca agt caa 288
Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

aaa gta tgc agt acg cct ggc aca aca ata agg ctt aac gga gat gag 336
Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

aaa ggt tct tat ata cag att aga tat tcc ttg gcc cca gct 378
Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
115 120 125

<210> 4
<211> 126
<212> PRT
<213> Bacillus thuringiensis

<400> 4

Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
115 120 125

<210> 5
<211> 1188
<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1188)
<223> tIC101-GG-tIC100 fusion

<400> 5

atg aca gta tat aac gta act ttt acc att aaa ttc tat aat gaa ggt 48
Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

gaa tgg ggg ggg cca gaa cct tac ggt aag ata tat gca tat ctt caa 96
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

aat cca gat cat aat ttc gaa att tgg tea caa gat aat tgg ggg aag 144 

Asn Pro Asp His Asn Phe Glu He Trp Ser Gin Asp Asn Trp Gly Lys 
35 40 45 

35 

gat acg cct gag aaa agt tct cac act caa aca att aaa ata agt age 192 

Asp Thr Pro Glu Lys Ser Ser His Thr Gin Thr He Lys He Ser Ser 
50 55 60 

40 cca aca ggg ggg cct ata aac caa atg tgt ttt tat ggt gat gta aaa 240 
Pro Thr Gly Gly Pro He Asn Gin Met Cys Phe Tyr Gly Asp Val Lys 
65 70 75 80 

gaa tac gac gta gga aat gca gat gat gtt etc gcc tat cca agt caa 2 88 

45 Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gin 

85 90 95 

aaa gta tgc agt acg cct ggc aca aca ata agg ctt aac gga gat gag 336 
Lys Val Cys Ser Thr Pro Gly Thr Thr He Arg Leu Asn Gly Asp Glu 
50 100 105 110 

aaa ggt tct tat ata cag att aga tat tcc ttg gcc cca gct ggt gga 384
Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Gly
115 120 125

atg gga att atc aac att caa gac gaa att aat gac tac atg aaa ggt 432
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
130 135 140

atg tat ggt gca aca tct gtt aaa agc act tat gac ccc tca ttc aaa 480
Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
145 150 155 160

gta ttt aac gaa tct gtg aca cct caa tat gat gtg att cca aca gaa 528 
10 Val Phe Asn Glu Ser Val Thr Pro Gin Tyr Asp Val lie Pro Thr Glu 

165 170 175 

cct gta aat aat cat att act act aaa gta ata gat aat cca ggg act 576 
Pro Val Asn Asn His lie Thr Thr Lys Val lie Asp Asn Pro Gly Thr 
15 180 185 190 

tea gaa gta ace agt aca gta acg ttc aca tgg acg gaa acc gac act 624 

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr 
195 200 205 

20 

gta acc tct gca gtg act aaa ggg tat aaa gtc ggt ggt tea gta age 6 72 

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser 
210 215 220 

25 tea aaa gca act ttt aaa ttt get ttt gtt act tct gat gtt act gta 72 0 

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val 
225 230 235 240 

act gta tea gca gaa tat aat tat agt aca aca gaa aca aca aca aaa 768 
30 Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys 

245 250 255 

aca gat aca cgc aca tgg acg gat teg acg aca gta aaa gee cct cca 816 
Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro 
35 260 265 270 

aga act aat gta gaa gtt gca tat att ate caa act gga aat tat aac 864 
Arg Thr Asn Val Glu Val Ala Tyr lie He Gin Thr Gly Asn Tyr Asn 
275 280 285 

40 

gtt ccg gtt aat gta gag tct gat atg act gga acg eta ttt tgc aga 912 
Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg 
290 295 300 

45 ggg tat aga gat ggt gca eta att gca gcg get tat gtt tct ata aca 960 
Gly Tyr Arg Asp Gly Ala Leu He Ala Ala Ala Tyr Val Ser He Thr 
305 310 315 320 

gat tta gca gat tac aat cct aat ttg ggt ctt aca aat gaa ggg aat 1008 
50 Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn 

325 330 335 

ggg gtt gct cat ttt aaa ggt gaa ggt tat ata gag ggt gcg caa ggc 1056
Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
340 345 350

tta aga agc tac att caa gtt aca gaa tat cca gtg gat gat aat ggc 1104
Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
355 360 365

aga cat tcg ata cca aaa act tat ata att aaa ggt tca tta gca ccc 1152
Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
370 375 380

aat gtt act tta ata aat gat aga aag gaa ggt aga 1188
Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
385 390 395

<210> 6
<211> 396
<212> PRT
<213> artificial

<400> 6

Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

25 Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys He Tyr Ala Tyr Leu Gin 
20 25 30 

Asn Pro Asp His Asn Phe Glu He Trp Ser Gin Asp Asn Trp Gly Lys 
35 40 45 

30 

Asp Thr Pro Glu Lys Ser Ser His Thr Gin Thr He Lys He Ser Ser 
50 55 60 

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gin 
85 90 95 

40 Lys Val Cys Ser Thr Pro Gly Thr Thr He Arg Leu Asn Gly Asp Glu 
100 105 110 

Lys Gly Ser Tyr He Gin He Arg Tyr Ser Leu Ala Pro Ala Gly Gly 
115 120 125 

45 

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
130 135 140

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
145 150 155 160

Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
165 170 175

Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
180 185 190

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr 
195 200 205 

5 

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser 
210 215 220 

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val 
10 225 230 235 240 

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys 
245 250 255 

15 Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro 
260 265 270 

Arg Thr Asn Val Glu Val Ala Tyr lie lie Gin Thr Gly Asn Tyr Asn 
275 280 285 

20 

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg 
290 295 300 

Gly Tyr Arg Asp Gly Ala Leu lie Ala Ala Ala Tyr Val Ser lie Thr 
25 305 310 315 320 

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn 
325 330 335 

30 Gly Val Ala His Phe Lys Gly Glu Gly Tyr lie Glu Gly Ala Gin Gly 
340 345 350 

Leu Arg Ser Tyr lie Gin Val Thr Glu Tyr Pro Val Asp Asp Asn Gly 
355 360 365 

35 

Arg His Ser lie Pro Lys Thr Tyr lie lie Lys Gly Ser Leu Ala Pro 
370 375 380 

Asn Val Thr Leu lie Asn Asp Arg Lys Glu Gly Arg 
40 385 390 395 
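
The declared length of SEQ ID NO:6 follows from its parts: the 126-residue tIC101 protein (SEQ ID NO:4), a two-residue Gly-Gly linker, and the 268-residue tIC100 protein (SEQ ID NO:2). The short Python sketch below works through that bookkeeping. The helper `fusion_layout` and the constant `SEQ6_PARTS` are hypothetical names introduced only for illustration; the segment lengths are taken from the `<211>` entries of the listing.

```python
def fusion_layout(parts):
    """Return (name, first_residue, last_residue) per segment, 1-based."""
    layout, start = [], 1
    for name, length in parts:
        layout.append((name, start, start + length - 1))
        start += length
    return layout


# Segment lengths from the <211> entries of SEQ ID NO:4 and SEQ ID NO:2.
SEQ6_PARTS = [("tIC101", 126), ("GG linker", 2), ("tIC100", 268)]

layout = fusion_layout(SEQ6_PARTS)
total_residues = layout[-1][2]           # last residue of the last segment
total_nucleotides = total_residues * 3   # CDS length declared for SEQ ID NO:5

for name, first, last in layout:
    print(f"{name}: residues {first}-{last}")
```

Under these assumptions the linker occupies residues 127-128 and the tIC100 portion begins with Met at residue 129, which agrees with the rows printed above.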



<210> 7
<211> 1200
<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1200)
<223> tIC100-GSGGAS-tIC101

<400> 7

atg gga att atc aac att caa gac gaa att aat gac tac atg aaa ggt 48
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15

atg tat ggt gca aca tct gtt aaa agc act tat gac ccc tca ttc aaa 96
Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

gta ttt aac gaa tct gtg aca cct caa tat gat gtg att cca aca gaa 144 

Val Phe Asn Glu Ser Val Thr Pro Gin Tyr Asp Val lie Pro Thr Glu 

35 40 45 

10 

cct gta aat aat cat att act act aaa gta ata gat aat cca ggg act 192 

Pro Val Asn Asn His lie Thr Thr Lys Val He Asp Asn Pro Gly Thr 
50 55 60 

15 tea gaa gta acc agt aca gta acg ttc aca tgg acg gaa acc gac act 240 
Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr 
65 70 75 80 

gta acc tct gca gtg act aaa ggg tat aaa gtc ggt ggt tea gta age 288 
20 Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser 

85 90 ~ 95 

tea aaa gca act ttt aaa ttt get ttt gtt act tct gat gtt act gta 336 
Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val 
25 100 105 ~ 110 

act gta tea gca gaa tat aat tat agt aca aca gaa aca aca aca aaa 384 
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys 
115 120 125 

30 

aca gat aca cgc aca tgg acg gat tcg acg aca gta aaa gcc cct cca 432
Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
130 135 140

aga act aat gta gaa gtt gca tat att atc caa act gga aat tat aac 480
Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

gtt ccg gtt aat gta gag tct gat atg act gga acg eta ttt tgc aga 528 
40 Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg 

165 170 175 

ggg tat aga gat ggt gca eta att gca gcg get tat gtt tct ata aca 576 
Gly Tyr Arg Asp Gly Ala Leu He Ala Ala Ala Tyr Val Ser He Thr 
45 180 185 190 

gat tta gca gat tac aat cct aat ttg ggt ctt aca aat gaa ggg aat 624 
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn 
195 200 205 

50 

ggg gtt gct cat ttt aaa ggt gaa ggt tat ata gag ggt gcg caa ggc 672
Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
210 215 220

tta aga agc tac att caa gtt aca gaa tat cca gtg gat gat aat ggc 720
Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225 230 235 240

aga cat tcg ata cca aaa act tat ata att aaa ggt tca tta gca ccc 768
Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
245 250 255

aat gtt act tta ata aat gat aga aag gaa ggt aga gga tcc ggt gga 816
Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg Gly Ser Gly Gly
260 265 270

get age atg aca gta tat aac gta act ttt ace att aaa ttc tat aat 864 
Ala Ser Met Thr Val Tyr Asn Val Thr Phe Thr lie Lys Phe Tyr Asn 
275 280 285 

15 

gaa ggt gaa tgg ggg ggg cca gaa cct tac ggt aag ata tat gca tat 912 
Glu Gly Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys lie Tyr Ala Tyr 
290 295 300 

20 ctt caa aat cca gat cat aat ttc gaa att tgg tea caa gat aat tgg 960 
Leu Gin Asn Pro Asp His Asn Phe Glu lie Trp Ser Gin Asp Asn Trp 
305 310 315 320 

ggg aag gat acg cct gag aaa agt tct cac act caa aca att aaa ata 1008 
25 Gly Lys Asp Thr Pro Glu Lys Ser Ser His Thr Gin Thr lie Lys lie 

325 330 335 

agt age cca aca ggg ggg cct ata aac caa atg tgt ttt tat ggt gat 1056 
Ser Ser Pro Thr Gly Gly Pro lie Asn Gin Met Cys Phe Tyr Gly Asp 
30 340 345 350 

gta aaa gaa tac gac gta gga aat gca gat gat gtt etc gee tat cca 1104 
Val Lys Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro 
355 360 365 

35 

agt caa aaa gta tgc agt acg cct ggc aca aca ata agg ctt aac gga 1152 
Ser Gin Lys Val Cys Ser Thr Pro Gly Thr Thr lie Arg Leu Asn Gly 
370 375 380 

gat gag aaa ggt tct tat ata cag att aga tat tcc ttg gcc cca gct 1200
Asp Glu Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
385 390 395 400



<210> 8
<211> 400
<212> PRT
<213> artificial

<400> 8

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gin Tyr Asp Val He Pro Thr Glu 
35 40 45 

5 

Pro Val Asn Asn His He Thr Thr Lys Val lie Asp Asn Pro Gly Thr 
50 55 60 

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr 
10 65 70 75 80 

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser 
85 90 95 

15 Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val 
100 105 110 

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys 
115 120 125 

20 

Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro 
130 135 140 

Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg 
165 170 175 

30 Gly Tyr Arg Asp Gly Ala Leu lie Ala Ala Ala Tyr Val Ser He Thr 
180 185 190 

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn 
195 200 205 

35 

Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225 230 235 240

Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg Gly Ser Gly Gly
260 265 270

Ala Ser Met Thr Val Tyr Asn Val Thr Phe Thr He Lys Phe Tyr Asn 
275 280 285 

50 

Glu Gly Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys He Tyr Ala Tyr 
290 295 300 



Leu Gln Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp
305 310 315 320

Gly Lys Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile
325 330 335

Ser Ser Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp
340 345 350

Val Lys Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro
355 360 365

Ser Gln Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly
370 375 380

Asp Glu Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
385 390 395 400
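
The position of the six-residue linker within SEQ ID NO:8 can be located by converting the three-letter residue rows to one-letter code and searching for the linker string. The Python sketch below does this for residues 257-288 only. It assumes the rows read `Ile` where the scanned text is ambiguous, and the names `THREE_TO_ONE`, `to_one_letter`, `ROWS` and `LINKER` are hypothetical helpers introduced for illustration.

```python
# Three-letter to one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row: str) -> str:
    """Convert a whitespace-separated row of three-letter residue codes."""
    return "".join(THREE_TO_ONE[res] for res in row.split())


# Residues 257-288 of SEQ ID NO:8, as read from the listing (assumed reading).
ROWS = [
    "Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg Gly Ser Gly Gly",
    "Ala Ser Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn",
]
FIRST_RESIDUE = 257
LINKER = "GSGGAS"

window = "".join(to_one_letter(r) for r in ROWS)
offset = window.find(LINKER)            # 0-based offset inside the window
linker_start = FIRST_RESIDUE + offset   # 1-based residue number
linker_end = linker_start + len(LINKER) - 1
```

With these rows the linker spans residues 269-274, directly after the 268 residues of tIC100, and the tIC101 portion begins with Met at residue 275.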



<210> 9
<211> 1188
<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1188)
<223> tIC100-GG-tIC101

<400> 9

atg gga att atc aac att caa gac gaa att aat gac tac atg aaa ggt 48
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15



atg tat ggt gca aca tct gtt aaa age act tat gac ccc tea ttc aaa 96 

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys 

20 25 30 

35 

gta ttt aac gaa tct gtg aca cct caa tat gat gtg att cca aca gaa 144 

Val Phe Asn Glu Ser Val Thr Pro Gin Tyr Asp Val He Pro Thr Glu 
35 40 45 



cct gta aat aat cat att act act aaa gta ata gat aat cca ggg act 192
Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
50 55 60

tea gaa gta acc agt aca gta acg ttc aca tgg acg gaa acc gac act 240 
45 Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr 
65 70 75 80 



gta acc tct gca gtg act aaa ggg tat aaa gtc ggt ggt tca gta agc 288
Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser
85 90 95

tca aaa gca act ttt aaa ttt gct ttt gtt act tct gat gtt act gta 336
Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val
100 105 110

act gta tca gca gaa tat aat tat agt aca aca gaa aca aca aca aaa 384
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys
115 120 125

aca gat aca cgc aca tgg acg gat tcg acg aca gta aaa gcc cct cca 432
Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro
130 135 140

aga act aat gta gaa gtt gca tat att atc caa act gga aat tat aac 480
Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

gtt ccg gtt aat gta gag tct gat atg act gga acg cta ttt tgc aga 528
Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
165 170 175

ggg tat aga gat ggt gca eta att gca gcg get tat gtt tct ata aca 576 
Gly Tyr Arg Asp Gly Ala Leu He Ala Ala Ala Tyr Val Ser He Thr 
180 185 190 

20 

gat tta gca gat tac aat cct aat ttg ggt ctt aca aat gaa ggg aat 624 
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn 
195 200 205 

ggg gtt gct cat ttt aaa ggt gaa ggt tat ata gag ggt gcg caa ggc 672
Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu Gly Ala Gln Gly
210 215 220

tta aga agc tac att caa gtt aca gaa tat cca gtg gat gat aat ggc 720
Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val Asp Asp Asn Gly
225 230 235 240

aga cat tcg ata cca aaa act tat ata att aaa ggt tca tta gca ccc 768
Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly Ser Leu Ala Pro
245 250 255

aat gtt act tta ata aat gat aga aag gaa ggt aga ggt gga atg aca 816
Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg Gly Gly Met Thr
260 265 270

gta tat aac gta act ttt acc att aaa ttc tat aat gaa ggt gaa tgg 864
Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly Glu Trp
275 280 285

ggg ggg cca gaa cct tac ggt aag ata tat gca tat ctt caa aat cca 912
Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln Asn Pro
290 295 300

gat cat aat ttc gaa att tgg tca caa gat aat tgg ggg aag gat acg 960
Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys Asp Thr
305 310 315 320

cct gag aaa agt tct cac act caa aca att aaa ata agt agc cca aca 1008
Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser Pro Thr
325 330 335

ggg ggg cct ata aac caa atg tgt ttt tat ggt gat gta aaa gaa tac 1056
Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu Tyr
340 345 350

gac gta gga aat gca gat gat gtt ctc gcc tat cca agt caa aaa gta 1104
Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln Lys Val
355 360 365

tgc agt acg cct ggc aca aca ata agg ctt aac gga gat gag aaa ggt 1152
Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu Lys Gly
370 375 380

tct tat ata cag att aga tat tcc ttg gcc cca gct 1188
Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala
385 390 395



<210> 10
<211> 396
<212> PRT
<213> artificial

<400> 10

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp Tyr Met Lys Gly
1 5 10 15

Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp Asn Pro Gly Thr
50 55 60

Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr Glu Thr Asp Thr
65 70 75 80

40 

Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly Gly Ser Val Ser 
85 90 95 

Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser Asp Val Thr Val 
45 100 105 110 

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu Thr Thr Thr Lys 
115 120 125 

50 Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val Lys Ala Pro Pro 
130 135 140 

Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr Gly Asn Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr Val Ser Ile Thr
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr Asn Glu Gly Asn 
195 200 205 

10 Gly Val Ala His Phe Lys Gly Glu Gly Tyr lie Glu Gly Ala Gin Gly 
210 215 220 

Leu Arg Ser Tyr lie Gin Val Thr Glu Tyr Pro Val Asp Asp Asn Gly 
225 230 235 240 

15 

Arg His Ser lie Pro Lys Thr Tyr lie lie Lys Gly Ser Leu Ala Pro 
245 250 255 

Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg Gly Gly Met Thr
260 265 270

Val Tyr Asn Val Thr Phe Thr lie Lys Phe Tyr Asn Glu Gly Glu Trp 
275 280 285 

25 Gly Gly Pro Glu Pro Tyr Gly Lys lie Tyr Ala Tyr Leu Gin Asn Pro 
290 295 300 

Asp His Asn Phe Glu lie Trp Ser Gin Asp Asn Trp Gly Lys Asp Thr 
305 310 315 320 

30 

Pro Glu Lys Ser Ser His Thr Gin Thr lie Lys lie Ser Ser Pro Thr 
325 330 335 

Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu Tyr
340 345 350

Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gin Lys Val 
355 360 365 

40 Cys Ser Thr Pro Gly Thr Thr lie Arg Leu Asn Gly Asp Glu Lys Gly 
370 375 380 

Ser Tyr lie Gin lie Arg Tyr Ser Leu Ala Pro Ala 
385 390 395 

45 

<210> 11
<211> 1227
<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1227)
<223> ET33-GGGS3-ET34

<400> 11

atg ggt atc atc aac att caa gat gag att aac aat tac atg aag gaa 48
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

5 

gtt tac ggt get act act gtt aag tct act tac gat cct tct ttc aag 96 
Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys 
20 25 30 

10 gtt ttc aat gaa tct gtt act cct caa ttc act gaa att cct act gaa 144 
Val Phe Asn Glu Ser Val Thr Pro Gin Phe Thr Glu lie Pro Thr Glu 
35 40 45 

cct gtc aac aac cag ctt act act aag agg gtc gac aat act ggt tct 192 
15 Pro Val Asn Asn Gin Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser 
50 55 60 

tac cct gtt gaa tct act gtt tct ttc act tgg act gaa act cat act 240
Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

gaa act tct get gtt act gaa ggt gtt aag get ggt act tct att tct 288 
Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser lie Ser 
85 90 95 

25 

act aag caa tct ttc aag ttc ggt ttc gtg aac tct gat gtt act ctt 336
Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

act gtt tct gct gag tac aac tac tct act act aac act act act act 384
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

act gaa act cat act tgg tct gat tct act aag gtt act att cct cct 432
Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

aag act tac gtt gaa gct gct tac atc atc cag aat ggt act tac aat 480
Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

gtt cct gtt aat gtt gaa tgc gat atg tct ggt act ctg ttc tgt cga 528 
Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg 
165 170 175 

45 

ggt tat cgt gat ggt get ctt att get get gtt tac gtt tct gtt get 576 
Gly Tyr Arg Asp Gly Ala Leu lie Ala Ala Val Tyr Val Ser Val Ala 
180 185 190 

50 gat ctt get gat tac aat cct aat ctt aat ctt act aat aag ggt gat 624 
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp 
195 200 205 

ggt att gct cat ttc aag ggt tct gga ttc att gaa ggt gct caa ggt 672
Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

ctt aga tct gtg atc caa gtt act gaa tac cct ctt gat gat aat aag 720
Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

ggt agg tct act cct att acg tac ctt ate aac ggt tct ctt get cct 768 
Gly Arg Ser Thr Pro lie Thr Tyr Leu lie Asn Gly Ser Leu Ala Pro 
245 250 255 

10 

aat gtt act ctt aag aat tct aat att aag ttc gga tec ggt gga ggt 816 
Asn Val Thr Leu Lys Asn Ser Asn lie Lys Phe Gly Ser Gly Gly Gly 
260 265 270 

15 tec ggt gga ggt tec ggt gga ggt tec get age atg act gtg tac aat 864 
Ser Gly Gly Gly Ser Gly Gly Gly Ser Ala Ser Met Thr Val Tyr Asn 
275 280 285 

get act ttc act ate aac ttt tac aat gaa ggt gaa tgg ggt ggt cct 912 
20 Ala Thr Phe Thr lie Asn Phe Tyr Asn Glu Gly Glu Trp Gly Gly Pro 
290 295 300 

gaa cct tac ggt tac ate aag gca tac ctt act aat cct gat cat gat 960 
Glu Pro Tyr Gly Tyr lie Lys Ala Tyr Leu Thr Asn Pro Asp His Asp 
25 305 310 315 320 

ttc gag att tgg aag caa gat gat tgg ggt aag tct act cct gag agg 1008 
Phe Glu lie Trp Lys Gin Asp Asp Trp Gly Lys Ser Thr Pro Glu Arg 
325 330 335 

30 

tct act tac act caa act att aag ata tct tct gat act ggt tct cct 1056 
Ser Thr Tyr Thr Gin Thr lie Lys lie Ser Ser Asp Thr Gly Ser Pro 
340 345 350 

35 ate aac cag atg tgc ttc tac ggt gac gtc aag gaa tac gat gtc ggc 1104 
lie Asn Gin Met Cys Phe Tyr Gly Asp Val Lys Glu Tyr Asp Val Gly 
355 360 365 

aac get gat gat att ctt get tac cct tct caa aag gtt tgc tct act 1152 
40 Asn Ala Asp Asp lie Leu Ala Tyr Pro Ser Gin Lys Val Cys Ser Thr 
370 375 380 

cct ggt gtt act gtt agg ctt gat ggt gat gag aag ggt tct tac gtt 1200
Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu Lys Gly Ser Tyr Val
385 390 395 400

act att aag tac tct ctt act cct gct 1227
Thr Ile Lys Tyr Ser Leu Thr Pro Ala
405



<210> 12
<211> 409
<212> PRT
<213> artificial

<400> 12

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60



Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr 

65 70 75 80 

20 Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser lie Ser 

85 90 95 



Thr Lys Gin Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu 
100 105 110 

25 

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr 
115 120 125 



Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr lie Pro Pro 
30 130 135 140 

Lys Thr Tyr Val Glu Ala Ala Tyr lie He Gin Asn Gly Thr Tyr Asn 
145 150 155 160 

35 Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg 

165 170 175 



Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Gly Gly Gly
260 265 270

Ser Gly Gly Gly Ser Gly Gly Gly Ser Ala Ser Met Thr Val Tyr Asn
275 280 285

Ala Thr Phe Thr lie Asn Phe Tyr Asn Glu Gly Glu Trp Gly Gly Pro 
5 290 295 300 

Glu Pro Tyr Gly Tyr He Lys Ala Tyr Leu Thr Asn Pro Asp His Asp 
305 310 315 320 

10 Phe Glu He Trp Lys Gin Asp Asp Trp Gly Lys Ser Thr Pro Glu Arg 

325 330 335 

Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser Asp Thr Gly Ser Pro
340 345 350

Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu Tyr Asp Val Gly
355 360 365

Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln Lys Val Cys Ser Thr
370 375 380

Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu Lys Gly Ser Tyr Val
385 390 395 400

Thr Ile Lys Tyr Ser Leu Thr Pro Ala
405





<210> 13
<211> 2397
<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1197)
<223> ET33-GSGGAS-:

<400> 13

atg aca gta tat aac 48
Met Thr Val Tyr Asn
1 5 10 15

gaa tgg ggg ggg cca gaa cct tac ggt aag ata tat gca tac ctt caa 96
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

aat cca gat cat aat ttc gaa att tgg tca caa gat aat tgg ggg aag 144
Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45



gat acg cct gag aaa agt tct cac act caa aca att aaa ata agt age 192 
Asp Thr Pro Glu Lys Ser Ser His Thr Gin Thr He Lys He Ser Ser 
50 55 60 



cca aca ggg ggg cct ata aac caa atg tgt ttt tat ggt gat gta aaa 240
Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

gaa tac gac gta gga aat gca gat gat gtt ctc gcc tat cca agt caa 288
Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

aaa gta tgc agt acg cct ggc aca aca ata agg ctt aac gga gat gag 336
Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

aaa ggt tct tat ata cag att aga tat tcc ttg gcc cca gct gga tcc 384
Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Ser
115 120 125

ggt gga gct agc atg gga att atc aac att caa gac gaa att aat gac 432
Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp
130 135 140

tac atg aaa ggt atg tat ggt gca aca tct gtt aaa agc act tat gac 480
Tyr Met Lys Gly Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp
145 150 155 160

ccc tca ttc aaa gta ttt aac gaa tct gtg aca cct caa tat gat gtg 528
Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val
165 170 175

att cca aca gaa cct gta aat aat cat att act act aaa gta ata gat 576
Ile Pro Thr Glu Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp
180 185 190

aat cca ggg act tca gaa gta acc agt aca gta acg ttc aca tgg acg 624
Asn Pro Gly Thr Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr
195 200 205

gaa acc gac act gta acc tct gca gtg act aaa ggg tat aaa gtc ggt 672
Glu Thr Asp Thr Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly
210 215 220

ggt tca gta agc tca aaa gca act ttt aaa ttt gct ttt gtt act tct 720
Gly Ser Val Ser Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser
225 230 235 240

gat gtt act gta act gta tca gca gaa tat aat tat agt aca aca gaa 768
Asp Val Thr Val Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu
245 250 255

aca aca aca aaa aca gat aca cgc aca tgg acg gat tcg acg aca gta 816
Thr Thr Thr Lys Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val
260 265 270

aaa gcc cct cca aga act aat gta gaa gtt gca tat att atc caa act 864
Lys Ala Pro Pro Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr
275 280 285

gga aat tat aac gtt ccg gtt aat gta gag tct gat atg act gga acg 912
Gly Asn Tyr Asn Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr
290 295 300

cta ttt tgc aga ggg tat aga gat ggt gca cta att gca gcg gct tat 960
Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr
305 310 315 320

gtt tct ata aca gat tta gca gat tac aat cct aat ttg ggt ctt aca 1008
Val Ser Ile Thr Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr
325 330 335

aat gaa ggg aat ggg gtt gct cat ttt aaa ggt gaa ggt tat ata gag 1056
Asn Glu Gly Asn Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu
340 345 350

ggt gcg caa ggc tta aga agc tac att caa gtt aca gaa tat cca gtg 1104
Gly Ala Gln Gly Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val
355 360 365

gat gat aat ggc aga cat tcg ata cca aaa act tat ata att aaa ggt 1152
Asp Asp Asn Gly Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly
370 375 380

tca tta gca ccc aat gtt act tta ata aat gat aga aag gaa ggt 1197
Ser Leu Ala Pro Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly
385 390 395

agaatgggaa ttattaatat ccaagatgaa attaataatt acatgaaaga ggtatatggt 1257
gcaacaactg ttaaaagcac atacgatccc tcattcaaag tatttaatga atctgtgaca 1317
ccccaattca ctgaaattcc aacagaacct gtaaataatc aattaactac aaaaagagta 1377
gataatacgg gtagttaccc agtagaaagt actgtatcgt tcacatggac ggaaacccat 1437
acagaaacaa gtgcagtaac tgagggagtg aaagccggca cctcaataag tactaaacaa 1497
tcttttaaat ttggttttgt taactctgat gttactttaa cggtatcagc agaatataat 1557
tatagtacaa caaatacaac tacaacaaca gaaacacaca cctggtcaga ttcaacaaaa 1617
gtaactattc ctcccaaaac ttatgtggag gctgcataca ttatccaaaa tggaacatat 1677
aatgttccgg ttaatgtaga atgtgatatg agtggaactt tattttgtag agggtataga 1737
gatggtgcgc ttattgcagc agtttatgtt tctgtagcgg atttagcaga ttacaatcca 1797
aatttaaatc ttacaaataa aggggatgga attgctcact ttaaaggttc gggttttata 1857
gagggtgcac aaggcttgcg aagcattatt caggttacag aatatccact agatgataat 1917
aaaggtcgct cgacaccaat aacttattta ataaatggtt cattagcacc aaatgttaca 1977
ttaaaaaata gcaacataaa atttggatcc ggtggagcta gcatgacagt atataacgca 2037
actttcacca ttaatttcta taatgaagga gaatgggggg ggccagaacc atatggttat 2097
ataaaagcat atcttacaaa tccagatcat gattttgaaa tttggaaaca agatgattgg 2157
gggaaaagta ctcctgagag aagtacttat acgcaaacga ttaaaataag tagcgacact 2217
ggttccccta taaaccaaat gtgtttttat ggtgatgtga aagaatacga cgtaggaaat 2277
gcagatgata ttctcgctta tccaagtcaa aaagtatgca gtacacctgg tgtaacagta 2337
cgacttgatg gcgatgagaa aggttcttat gtgacaatta agtattcctt gactccagca 2397

<210> 14
<211> 399
<212> PRT
<213> artificial

<400> 14

Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Ser
115 120 125

Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp
130 135 140

Tyr Met Lys Gly Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp
145 150 155 160

Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val
165 170 175

Ile Pro Thr Glu Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp
180 185 190

Asn Pro Gly Thr Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr
195 200 205

Glu Thr Asp Thr Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly
210 215 220

Gly Ser Val Ser Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser
225 230 235 240

Asp Val Thr Val Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu
245 250 255

Thr Thr Thr Lys Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val
260 265 270

Lys Ala Pro Pro Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr
275 280 285

Gly Asn Tyr Asn Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr
290 295 300

Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr
305 310 315 320

Val Ser Ile Thr Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr
325 330 335

Asn Glu Gly Asn Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu
340 345 350

Gly Ala Gln Gly Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val
355 360 365

Asp Asp Asn Gly Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly
370 375 380

Ser Leu Ala Pro Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly
385 390 395

<210> 15 

<211> 1197 

<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1197)
<223> ET33-GSGGAS-ET34-plant

<400> 15

atg ggt atc atc aac att caa gat gag att aac aat tac atg aag gaa 48
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

gtt tac ggt gct act act gtt aag tct act tac gat cct tct ttc aag 96
Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

gtt ttc aat gaa tct gtt act cct caa ttc act gaa att cct act gaa 144
Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

cct gtc aac aac cag ctt act act aag agg gtc gac aat act ggt tct 192
Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

tac cct gtt gaa tct act gtt tct ttc act tgg act gaa act cat act 240
Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

gaa act tct gct gtt act gaa ggt gtt aag gct ggt act tct att tct 288
Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

act aag caa tct ttc aag ttc ggt ttc gtg aac tct gat gtt act ctt 336
Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

act gtt tct gct gag tac aac tac tct act act aac act act act act 384
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

act gaa act cat act tgg tct gat tct act aag gtt act att cct cct 432
Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

aag act tac gtt gaa gct gct tac atc atc cag aat ggt act tac aat 480
Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

gtt cct gtt aat gtt gaa tgc gat atg tct ggt act ctg ttc tgt cga 528
Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

ggt tat cgt gat ggt gct ctt att gct gct gtt tac gtt tct gtt gct 576
Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

gat ctt gct gat tac aat cct aat ctt aat ctt act aat aag ggt gat 624
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

ggt att gct cat ttc aag ggt tct gga ttc att gaa ggt gct caa ggt 672
Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

ctt aga tct gtg atc caa gtt act gaa tac cct ctt gat gat aat aag 720
Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

ggt agg tct act cct att acg tac ctt atc aac ggt tct ctt gct cct 768
Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

aat gtt act ctt aag aat tct aat att aag ttc gga tcc ggt gga gct 816
Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Gly Gly Ala
260 265 270

agc atg act gtg tac aat gct act ttc act atc aac ttt tac aat gaa 864
Ser Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu
275 280 285

ggt gaa tgg ggt ggt cct gaa cct tac ggt tac atc aag gca tac ctt 912
Gly Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu
290 295 300

act aat cct gat cat gat ttc gag att tgg aag caa gat gat tgg ggt 960
Thr Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly
305 310 315 320

aag tct act cct gag agg tct act tac act caa act att aag ata tct 1008
Lys Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser
325 330 335

tct gat act ggt tct cct atc aac cag atg tgc ttc tac ggt gac gtc 1056
Ser Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val
340 345 350

aag gaa tac gat gtc ggc aac gct gat gat att ctt gct tac cct tct 1104
Lys Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser
355 360 365

caa aag gtt tgc tct act cct ggt gtt act gtt agg ctt gat ggt gat 1152
Gln Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp
370 375 380

gag aag ggt tct tac gtt act att aag tac tct ctt act cct gct 1197
Glu Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
385 390 395
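
As a cross-check on the listing, the translation printed beneath each codon row can be reproduced from the DNA line. The sketch below is illustrative only and is not part of the filed sequence listing. It assumes the standard genetic code; `CODON_TABLE` and `translate` are hypothetical helper names introduced here, and the input is the first codon row of SEQ ID NO: 15 with OCR-confusable letters normalized to a, c, g and t.

```python
# Standard genetic code, indexed by codon in t/c/a/g order (assumption:
# the listing is read with the standard table and lower-case DNA letters).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence to one-letter amino acids, ignoring spaces."""
    seq = "".join(dna.lower().split())
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 16 codons of SEQ ID NO: 15 (nucleotide positions 1-48).
first_row = "atg ggt atc atc aac att caa gat gag att aac aat tac atg aag gaa"
print(translate(first_row))
```

Run as written, this prints MGIINIQDEINNYMKE, that is Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu, the residues listed for positions 1 to 16 of SEQ ID NO: 15.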



<210> 16 

<211> 399
<212> PRT
<213> artificial

<400> 16

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Gly Gly Ala
260 265 270

Ser Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu
275 280 285

Gly Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu
290 295 300

Thr Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly
305 310 315 320

Lys Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser
325 330 335

Ser Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val
340 345 350

Lys Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser
355 360 365

Gln Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp
370 375 380

Glu Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
385 390 395

<210> 17 

<211> 1239 

<212> DNA 

<213> artificial 

<220> 

<221> CDS 
<222> (1)..(1239)
<223> ET33-LO linker-ET34-plant

<400> 17

atg ggt atc atc aac att caa gat gag att aac aat tac atg aag gaa 48
Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

gtt tac ggt gct act act gtt aag tct act tac gat cct tct ttc aag 96
Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

gtt ttc aat gaa tct gtt act cct caa ttc act gaa att cct act gaa 144
Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

cct gtc aac aac cag ctt act act aag agg gtc gac aat act ggt tct 192
Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

tac cct gtt gaa tct act gtt tct ttc act tgg act gaa act cat act 240
Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

gaa act tct gct gtt act gaa ggt gtt aag gct ggt act tct att tct 288
Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

act aag caa tct ttc aag ttc ggt ttc gtg aac tct gat gtt act ctt 336
Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

WO 02/22662 PCT/US01/28746

act gtt tct gct gag tac aac tac tct act act aac act act act act 384
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

act gaa act cat act tgg tct gat tct act aag gtt act att cct cct 432
Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

aag act tac gtt gaa gct gct tac atc atc cag aat ggt act tac aat 480
Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

gtt cct gtt aat gtt gaa tgc gat atg tct ggt act ctg ttc tgt cga 528
Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

ggt tat cgt gat ggt gct ctt att gct gct gtt tac gtt tct gtt gct 576
Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

gat ctt gct gat tac aat cct aat ctt aat ctt act aat aag ggt gat 624
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

ggt att gct cat ttc aag ggt tct gga ttc att gaa ggt gct caa ggt 672
Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

ctt aga tct gtg atc caa gtt act gaa tac cct ctt gat gat aat aag 720
Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

ggt agg tct act cct att acg tac ctt atc aac ggt tct ctt gct cct 768
Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

aat gtt act ctt aag aat tct aat att aag ttc gga tcc cca gct ttg 816
Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Pro Ala Leu
260 265 270

ctt aag gag gct cca aga gct gag gaa gag ttg cca cca gct agc atg 864
Leu Lys Glu Ala Pro Arg Ala Glu Glu Glu Leu Pro Pro Ala Ser Met
275 280 285

act gtg tac aat gct act ttc act atc aac ttt tac aat gaa ggt gaa 912
Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly Glu
290 295 300

tgg ggt ggt cct gaa cct tac ggt tac atc aag gca tac ctt act aat 960
Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr Asn
305 310 315 320

cct gat cat gat ttc gag att tgg aag caa gat gat tgg ggt aag tct 1008
Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys Ser
325 330 335

act cct gag agg tct act tac act caa act att aag ata tct tct gat 1056
Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser Asp
340 345 350

act ggt tct cct atc aac cag atg tgc ttc tac ggt gac gtc aag gaa 1104
Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu
355 360 365

tac gat gtc ggc aac gct gat gat att ctt gct tac cct tct caa aag 1152
Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln Lys
370 375 380

gtt tgc tct act cct ggt gtt act gtt agg ctt gat ggt gat gag aag 1200
Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu Lys
385 390 395 400

ggt tct tac gtt act att aag tac tct ctt act cct gct 1239
Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
405 410

<210> 18 
<211> 413 
<212> PRT 
<213> artificial

<400> 18

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe Gly Ser Pro Ala Leu
260 265 270

Leu Lys Glu Ala Pro Arg Ala Glu Glu Glu Leu Pro Pro Ala Ser Met
275 280 285

Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly Glu
290 295 300

Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr Asn
305 310 315 320

Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys Ser
325 330 335

Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser Asp
340 345 350

Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys Glu
355 360 365

Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln Lys
370 375 380

Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu Lys
385 390 395 400

Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
405 410
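
The <211> lengths of the fusion entries can be checked against their parts. The sketch below is illustrative only and is not part of the filed sequence listing. `ET33_LEN` is the <211> value of SEQ ID NO: 24; `ET34_LEN` and the linker boundaries are assumptions made by reading the fusion sequences (the residues between the last ET33 residue, Phe, and the first ET34 residue, Met), and the key `long` is a hypothetical label for the 20-residue linker of SEQ ID NO: 18.

```python
# Residue counts read from the listing. ET33_LEN is the <211> value of
# SEQ ID NO: 24; ET34_LEN and the linker boundaries are assumptions made
# by reading the fusion sequences, not values stated in the listing.
ET33_LEN = 267
ET34_LEN = 126
LINKERS = {
    "GSGGAS": "GSGGAS",
    "long": "GSPALLKEAPRAEEELPPAS",  # hypothetical label for the SEQ ID NO: 18 linker
}


def fusion_lengths(linker_name: str) -> tuple[int, int]:
    """Return (protein length in residues, CDS length in nucleotides)."""
    residues = ET33_LEN + len(LINKERS[linker_name]) + ET34_LEN
    return residues, 3 * residues


for name in LINKERS:
    print(name, fusion_lengths(name))
```

Under these assumptions the GSGGAS fusions come to 399 residues and 1197 nucleotides, and the longer linker gives 413 residues and 1239 nucleotides, which agrees with the <211> values given for the fusion entries. The CDS lengths are exact multiples of three, so the annotated coding regions carry no stop codon.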

<210> 19 

<211> 1197
<212> DNA
<213> artificial
<220>
<221> CDS
<222> (1)..(1197)
<223> ET34-GSGGAS-ET33

<400> 19

atg aca gta tat aac gca act ttc acc att aat ttc tat aat gaa gga 48
Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly
1 5 10 15

gaa tgg ggg ggg cca gaa cca tat ggt tat ata aaa gca tat ctt aca 96
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr
20 25 30

aat cca gat cat gat ttt gaa att tgg aaa caa gat gat tgg ggg aaa 144
Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys
35 40 45

agt act cct gag aga agt act tat acg caa acg att aaa ata agt agc 192
Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

gac act ggt tcc cct ata aac caa atg tgt ttt tat ggt gat gtg aaa 240
Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

gaa tac gac gta gga aat gca gat gat att ctc gct tat cca agt caa 288
Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln
85 90 95

aaa gta tgc agt aca cct ggt gta aca gta cga ctt gat ggc gat gag 336
Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu
100 105 110

aaa ggt tct tat gtg aca att aag tat tcc ttg act cca gca gga tcc 384
Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala Gly Ser
115 120 125

ggt gga gct agc atg gga att att aat atc caa gat gaa att aat aat 432
Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn
130 135 140

tac atg aaa gag gta tat ggt gca aca act gtt aaa agc aca tac gat 480
Tyr Met Lys Glu Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp
145 150 155 160

ccc tca ttc aaa gta ttt aat gaa tct gtg aca ccc caa ttc act gaa 528
Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu
165 170 175

att cca aca gaa cct gta aat aat caa tta act aca aaa aga gta gat 576
Ile Pro Thr Glu Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp
180 185 190

aat acg ggt agt tac cca gta gaa agt act gta tcg ttc aca tgg acg 624
Asn Thr Gly Ser Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr
195 200 205

gaa acc cat aca gaa aca agt gca gta act gag gga gtg aaa gcc ggc 672
Glu Thr His Thr Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly
210 215 220

acc tca ata agt act aaa caa tct ttt aaa ttt ggt ttt gtt aac tct 720
Thr Ser Ile Ser Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser
225 230 235 240

gat gtt act tta acg gta tca gca gaa tat aat tat agt aca aca aat 768
Asp Val Thr Leu Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn
245 250 255

aca act aca aca aca gaa aca cac acc tgg tca gat tca aca aaa gta 816
Thr Thr Thr Thr Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val
260 265 270

act att cct ccc aaa act tat gtg gag gct gca tac att atc caa aat 864
Thr Ile Pro Pro Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn
275 280 285

gga aca tat aat gtt ccg gtt aat gta gaa tgt gat atg agt gga act 912
Gly Thr Tyr Asn Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr
290 295 300

tta ttt tgt aga ggg tat aga gat ggt gcg ctt att gca gca gtt tat 960
Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr
305 310 315 320

gtt tct gta gcg gat tta gca gat tac aat cca aat tta aat ctt aca 1008
Val Ser Val Ala Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr
325 330 335

aat aaa ggg gat gga att gct cac ttt aaa ggt tcg ggt ttt ata gag 1056
Asn Lys Gly Asp Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu
340 345 350

ggt gca caa ggc ttg cga agc att att cag gtt aca gaa tat cca cta 1104
Gly Ala Gln Gly Leu Arg Ser Ile Ile Gln Val Thr Glu Tyr Pro Leu
355 360 365

gat gat aat aaa ggt cgc tcg aca cca ata act tat tta ata aat ggt 1152
Asp Asp Asn Lys Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly
370 375 380

tca tta gca cca aat gtt aca tta aaa aat agc aac ata aaa ttt 1197
Ser Leu Ala Pro Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe
385 390 395



<210> 20 
<211> 399
<212> PRT
<213> artificial

<400> 20

Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr
20 25 30

Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys
35 40 45

Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala Gly Ser
115 120 125

Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn
130 135 140

Tyr Met Lys Glu Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp
145 150 155 160

Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu
165 170 175

Ile Pro Thr Glu Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp
180 185 190

Asn Thr Gly Ser Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr
195 200 205

Glu Thr His Thr Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly
210 215 220

Thr Ser Ile Ser Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser
225 230 235 240

Asp Val Thr Leu Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn
245 250 255

Thr Thr Thr Thr Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val
260 265 270






Thr Ile Pro Pro Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn
275 280 285

Gly Thr Tyr Asn Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr
290 295 300

Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr
305 310 315 320

Val Ser Val Ala Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr
325 330 335

Asn Lys Gly Asp Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu
340 345 350

Gly Ala Gln Gly Leu Arg Ser Ile Ile Gln Val Thr Glu Tyr Pro Leu
355 360 365

Asp Asp Asn Lys Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly
370 375 380

Ser Leu Ala Pro Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe
385 390 395

<210> 21 

<211> 1197 

<212> DNA 

<213> artificial 

<220> 

<221> CDS 

<222> (1)..(1197)
<223> ET34-GSGGAS-ET33-plant

<400> 21

atg act gtg tac aat gct act ttc act atc aac ttt tac aat gaa ggt 48
Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly
1 5 10 15

gaa tgg ggt ggt cct gaa cct tac ggt tac atc aag gca tac ctt act 96
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr
20 25 30

aat cct gat cat gat ttc gag att tgg aag caa gat gat tgg ggt aag 144
Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys
35 40 45

tct act cct gag agg tct act tac act caa act att aag ata tct tct 192
Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

gat act ggt tct cct atc aac cag atg tgc ttc tac ggt gac gtc aag 240
Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

gaa tac gat gtc ggc aac gct gat gat att ctt gct tac cct tct caa 288
Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln
85 90 95

aag gtt tgc tct act cct ggt gtt act gtt agg ctt gat ggt gat gag 336
Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu
100 105 110

aag ggt tct tac gtt act att aag tac tct ctt act cct gct gga tcc 384
Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala Gly Ser
115 120 125

ggt gga gct agc atg ggt atc atc aac att caa gat gag att aac aat 432
Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn
130 135 140

tac atg aag gaa gtt tac ggt gct act act gtt aag tct act tac gat 480
Tyr Met Lys Glu Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp
145 150 155 160

cct tct ttc aag gtt ttc aat gaa tct gtt act cct caa ttc act gaa 528
Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu
165 170 175

att cct act gaa cct gtc aac aac cag ctt act act aag agg gtc gac 576
Ile Pro Thr Glu Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp
180 185 190

aat act ggt tct tac cct gtt gaa tct act gtt tct tta act tgg act 624
Asn Thr Gly Ser Tyr Pro Val Glu Ser Thr Val Ser Leu Thr Trp Thr
195 200 205

gaa act cat act gaa act tct gct gtt act gaa ggt gtt aag gct ggt 672
Glu Thr His Thr Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly
210 215 220

act tct att tct act aag caa tct ttc aag ttc ggt ttc gtg aac tct 720
Thr Ser Ile Ser Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser
225 230 235 240

gat gtt act ctt act gtt tct gct gag tac aac tac tct act act aac 768
Asp Val Thr Leu Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn
245 250 255

act act act act act gaa act cat act tgg tct gat tct act aag gtt 816
Thr Thr Thr Thr Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val
260 265 270

act att cct cct aag act tac gtt gaa gct gct tac atc atc cag aat 864
Thr Ile Pro Pro Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn
275 280 285

ggt act tac aat gtt cct gtt aat gtt gaa tgc gat atg tct ggt act 912
Gly Thr Tyr Asn Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr
290 295 300

ctg ttc tgt cga ggt tat cgt gat ggt gct ctt att gct gct gtt tac 960
Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr
305 310 315 320

gtt tct gtt gct gat ctt gct gat tac aat cct aat ctt aat ctt act 1008
Val Ser Val Ala Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr
325 330 335

aat aag ggt gat ggt att gct cat ttc aag ggt tct gga ttc att gaa 1056
Asn Lys Gly Asp Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu
340 345 350

ggt gct caa ggt ctt aga tct gtg atc caa gtt act gaa tac cct ctt 1104
Gly Ala Gln Gly Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu
355 360 365

gat gat aat aag ggt agg tct act cct att acg tac ctt atc aac ggt 1152
Asp Asp Asn Lys Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly
370 375 380

tct ctt gct cct aat gtt act ctt aag aat tct aat att aag ttc 1197
Ser Leu Ala Pro Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe
385 390 395

<210> 22 

<211> 399 

<212> PRT
<213> artificial

<400> 22

Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr
20 25 30

Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys
35 40 45

Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala Gly Ser
115 120 125

Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn
130 135 140

Tyr Met Lys Glu Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp
145 150 155 160

Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu
165 170 175

Ile Pro Thr Glu Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp
180 185 190

Asn Thr Gly Ser Tyr Pro Val Glu Ser Thr Val Ser Leu Thr Trp Thr
195 200 205

Glu Thr His Thr Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly
210 215 220

Thr Ser Ile Ser Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser
225 230 235 240

Asp Val Thr Leu Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn
245 250 255

Thr Thr Thr Thr Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val
260 265 270

Thr Ile Pro Pro Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn
275 280 285

Gly Thr Tyr Asn Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr
290 295 300

Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr
305 310 315 320

Val Ser Val Ala Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr
325 330 335

Asn Lys Gly Asp Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu
340 345 350

Gly Ala Gln Gly Leu Arg Ser Val Ile Gln Val Thr Glu Tyr Pro Leu
355 360 365

Asp Asp Asn Lys Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly
370 375 380

Ser Leu Ala Pro Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe
385 390 395
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<210> 23 

<211> 801 

<212> DNA 

<213> Bacillus thuringiensis 
<220> 

<221> CDS 

<222> (1) . . (801) 

<223> ET33 

<400> 23 

at 9" a-tt att aat ate caa gat gaa att aat aat tac atg aaa gag 48 

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

gta tat ggt gca aca act gtt aaa agc aca tac gat ccc tca ttc aaa 96
Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

gta ttt aat gaa tct gtg aca ccc caa ttc act gaa att cca aca gaa 144
Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

cct gta aat aat caa tta act aca aaa aga gta gat aat acg ggt agt 192
Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

tac cca gta gaa agt act gta tcg ttc aca tgg acg gaa acc cat aca 240
Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

gaa aca agt gca gta act gag gga gtg aaa gcc ggc acc tca ata agt 288
Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

act aaa caa tct ttt aaa ttt ggt ttt gtt aac tct gat gtt act tta 336
Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

acg gta tca gca gaa tat aat tat agt aca aca aat aca act aca aca 384
Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

aca gaa aca cac acc tgg tca gat tca aca aaa gta act att cct ccc 432
Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

aaa act tat gtg gag gct gca tac att atc caa aat gga aca tat aat 480
Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

gtt ccg gtt aat gta gaa tgt gat atg agt gga act tta ttt tgt aga 528
Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

ggg tat aga gat ggt gcg ctt att gca gca gtt tat gtt tct gta gcg 576
Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

gat tta gca gat tac aat cca aat tta aat ctt aca aat aaa ggg gat 624
Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

gga att gct cac ttt aaa ggt tcg ggt ttt ata gag ggt gca caa ggc 672
Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

ttg cga agc att att cag gtt aca gaa tat cca cta gat gat aat aaa 720
Leu Arg Ser Ile Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

ggt cgc tcg aca cca ata act tat tta ata aat ggt tca tta gca cca 768
Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

aat gtt aca tta aaa aat agc aac ata aaa ttt 801
Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe
260 265



<210> 24 

<211> 267 

<212> PRT 

<213> Bacillus thuringiensis 



<400> 24 

Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asn Tyr Met Lys Glu
1 5 10 15

Val Tyr Gly Ala Thr Thr Val Lys Ser Thr Tyr Asp Pro Ser Phe Lys
20 25 30

Val Phe Asn Glu Ser Val Thr Pro Gln Phe Thr Glu Ile Pro Thr Glu
35 40 45

Pro Val Asn Asn Gln Leu Thr Thr Lys Arg Val Asp Asn Thr Gly Ser
50 55 60

Tyr Pro Val Glu Ser Thr Val Ser Phe Thr Trp Thr Glu Thr His Thr
65 70 75 80

Glu Thr Ser Ala Val Thr Glu Gly Val Lys Ala Gly Thr Ser Ile Ser
85 90 95

Thr Lys Gln Ser Phe Lys Phe Gly Phe Val Asn Ser Asp Val Thr Leu
100 105 110

Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Asn Thr Thr Thr Thr
115 120 125

Thr Glu Thr His Thr Trp Ser Asp Ser Thr Lys Val Thr Ile Pro Pro
130 135 140

Lys Thr Tyr Val Glu Ala Ala Tyr Ile Ile Gln Asn Gly Thr Tyr Asn
145 150 155 160

Val Pro Val Asn Val Glu Cys Asp Met Ser Gly Thr Leu Phe Cys Arg
165 170 175

Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Val Tyr Val Ser Val Ala
180 185 190

Asp Leu Ala Asp Tyr Asn Pro Asn Leu Asn Leu Thr Asn Lys Gly Asp
195 200 205

Gly Ile Ala His Phe Lys Gly Ser Gly Phe Ile Glu Gly Ala Gln Gly
210 215 220

Leu Arg Ser Ile Ile Gln Val Thr Glu Tyr Pro Leu Asp Asp Asn Lys
225 230 235 240

Gly Arg Ser Thr Pro Ile Thr Tyr Leu Ile Asn Gly Ser Leu Ala Pro
245 250 255

Asn Val Thr Leu Lys Asn Ser Asn Ile Lys Phe
260 265





<210> 25

<211> 381

<212> DNA

<213> Bacillus thuringiensis

<220>

<221> CDS

<222> (1)..(381)

<223> ET34

<400> 25

atg aca gta tat aac gca act ttc acc att 48
Met Thr Val Tyr Asn Ala Thr Phe Thr Ile
1 5 10 15



gaa tgg ggg ggg cca gaa cca tat ggt tat ata aaa gca tat ctt aca 96 
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr
20 25 30

aat cca gat cat gat ttt gaa att tgg aaa caa gat gat tgg ggg aaa 144
Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys
35 40 45

agt act cct gag aga agt act tat acg caa acg att aaa ata agt agc 192
Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

gac act ggt tcc cct ata aac caa atg tgt ttt tat ggt gat gtg aaa 240
Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

gaa tac gac gta gga aat gca gat gat att ctc gct tat cca agt caa 288
Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln
85 90 95

aaa gta tgc agt aca cct ggt gta aca gta cga ctt gat ggc gat gag 336
Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu
100 105 110

aaa ggt tct tat gtg aca att aag tat tcc ttg act cca gca taa 381
Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
115 120 125



<210> 26 

<211> 126 

<212> PRT

<213> Bacillus thuringiensis 

<400> 26 

Met Thr Val Tyr Asn Ala Thr Phe Thr Ile Asn Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Tyr Ile Lys Ala Tyr Leu Thr
20 25 30

Asn Pro Asp His Asp Phe Glu Ile Trp Lys Gln Asp Asp Trp Gly Lys
35 40 45

Ser Thr Pro Glu Arg Ser Thr Tyr Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Asp Thr Gly Ser Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Ile Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Val Thr Val Arg Leu Asp Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Val Thr Ile Lys Tyr Ser Leu Thr Pro Ala
115 120 125

<210> 27 

<211> 805

<212> DNA

<213> Bacillus thuringiensis

<220>

<221> DNA

<222> (1)..(805)

<223> Cryptic tIC100, frameshift at position 84

<400> 27



atgggaatta tcaacattca agacgaaatt aatgactaca tgaaaggtat gtatggtgca 60
acatctgtta aaagcactta tgaccccctc attcaaagta tttaacgaat ctgtgacacc 120
tcaatatgat gtgattccaa cagaacctgt aaataatcat attactacta aagtaataga 180
taatccaggg acttcagaag taaccagtac agtaacgttc acatggacgg aaaccgacac 240
tgtaacctct gcagtgacta aagggtataa agtcggtggt tcagtaagct caaaagcaac 300
ttttaaattt gcttttgtta cttctgatgt tactgtaact gtatcagcag aatataatta 360
tagtacaaca gaaacaacaa caaaaacaga tacacgcaca tggacggatt cgacgacagt 420
aaaagcccct ccaagaacta atgtagaagt tgcatatatt atccaaactg gaaattataa 480
cgttccggtt aatgtagagt ctgatatgac tggaacgcta ttttgcagag ggtatagaga 540
tggtgcacta attgcagcgg cttatgtttc tataacagat ttagcagatt acaatcctaa 600
tttgggtctt acaaatgaag ggaatggggt tgctcatttt aaaggtgaag gttatataga 660
gggtgcgcaa ggcttaagaa gctacattca agttacagaa tatccagtgg atgataatgg 720
cagacattcg ataccaaaaa cttatataat taaaggttca ttagcaccca atgttacttt 780
aataaatgat agaaaggaag gtaga 805



<210> 28 

<211> 33

<212> DNA

<213> artificial

<220>

<221> DNA

<222> (1)..(33)

<223> Mutagenesis primer for tIC100 - reverse sequence

<400> 28

cgttaaatac tttgaatgag gggtcataag tgc 33

<210> 29 

<211> 33 

<212> DNA

<213> artificial

<220>

<221> DNA

<222> (1)..(33)

<223> Mutagenesis primer for tIC100 - forward sequence

<400> 29

gcacttatga cccctcattc aaagtattta acg 33

<210> 30 

<211> 21 

<212> DNA 

<213> artificial

<220>

<221> DNA

<222> (1)..(21)

<223> Pst oligo

<400> 30

aaaatgagca accccattcc c 21

<210> 31 

<211> 21 

<212> DNA 

<213> artificial 

<220> 

<221> DNA 

<222> (1)..(21)

<223> EcoRI oligo 

<400> 31 

attattttga attcttttat c 21 

<210> 32

<211> 1200

<212> DNA

<213> artificial

<220>

<221> CDS

<222> (1)..(1200)

<223> tIC101-GSGGAS-tIC100 fusion peptide



<400> 32 

atg aca gta tat aac gta act ttt acc att aaa ttc tat aat gaa ggt 48
Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

gaa tgg ggg ggg cca gaa cct tac ggt aag ata tat gca tac ctt caa 96
Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

aat cca gat cat aat ttc gaa att tgg tca caa gat aat tgg ggg aag 144
Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

gat acg cct gag aaa agt tct cac act caa aca att aaa ata agt agc 192
Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

cca aca ggg ggg cct ata aac caa atg tgt ttt tat ggt gat gta aaa 240
Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

gaa tac gac gta gga aat gca gat gat gtt ctc gcc tat cca agt caa 288
Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

aaa gta tgc agt acg cct ggc aca aca ata agg ctt aac gga gat gag 336
Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

aaa ggt tct tat ata cag att aga tat tcc ttg gcc cca gct gga tcc 384
Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Ser
115 120 125

ggt gga gct agc atg gga att atc aac att caa gac gaa att aat gac 432
Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp
130 135 140

tac atg aaa ggt atg tat ggt gca aca tct gtt aaa agc act tat gac 480
Tyr Met Lys Gly Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp
145 150 155 160

ccc tca ttc aaa gta ttt aac gaa tct gtg aca cct caa tat gat gtg 528
Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val
165 170 175

att cca aca gaa cct gta aat aat cat att act act aaa gta ata gat 576
Ile Pro Thr Glu Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp
180 185 190

aat cca ggg act tca gaa gta acc agt aca gta acg ttc aca tgg acg 624
Asn Pro Gly Thr Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr
195 200 205

gaa acc gac act gta acc tct gca gtg act aaa ggg tat aaa gtc ggt 672
Glu Thr Asp Thr Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly
210 215 220

ggt tca gta agc tca aaa gca act ttt aaa ttt gct ttt gtt act tct 720
Gly Ser Val Ser Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser
225 230 235 240

gat gtt act gta act gta tca gca gaa tat aat tat agt aca aca gaa 768
Asp Val Thr Val Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu
245 250 255

aca aca aca aaa aca gat aca cgc aca tgg acg gat tcg acg aca gta 816
Thr Thr Thr Lys Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val
260 265 270

aaa gcc cct cca aga act aat gta gaa gtt gca tat att atc caa act 864
Lys Ala Pro Pro Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr
275 280 285

gga aat tat aac gtt ccg gtt aat gta gag tct gat atg act gga acg 912
Gly Asn Tyr Asn Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr
290 295 300

cta ttt tgc aga ggg tat aga gat ggt gca cta att gca gcg gct tat 960
Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr
305 310 315 320

gtt tct ata aca gat tta gca gat tac aat cct aat ttg ggt ctt aca 1008
Val Ser Ile Thr Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr
325 330 335

aat gaa ggg aat ggg gtt gct cat ttt aaa ggt gaa ggt tat ata gag 1056
Asn Glu Gly Asn Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu
340 345 350

ggt gcg caa ggc tta aga agc tac att caa gtt aca gaa tat cca gtg 1104
Gly Ala Gln Gly Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val
355 360 365

gat gat aat ggc aga cat tcg ata cca aaa act tat ata att aaa ggt 1152
Asp Asp Asn Gly Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly
370 375 380

tca tta gca ccc aat gtt act tta ata aat gat aga aag gaa ggt aga 1200
Ser Leu Ala Pro Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
385 390 395 400



<210> 33 

<211> 400 

<212> PRT

<213> artificial 

<400> 33 

Met Thr Val Tyr Asn Val Thr Phe Thr Ile Lys Phe Tyr Asn Glu Gly
1 5 10 15

Glu Trp Gly Gly Pro Glu Pro Tyr Gly Lys Ile Tyr Ala Tyr Leu Gln
20 25 30

Asn Pro Asp His Asn Phe Glu Ile Trp Ser Gln Asp Asn Trp Gly Lys
35 40 45

Asp Thr Pro Glu Lys Ser Ser His Thr Gln Thr Ile Lys Ile Ser Ser
50 55 60

Pro Thr Gly Gly Pro Ile Asn Gln Met Cys Phe Tyr Gly Asp Val Lys
65 70 75 80

Glu Tyr Asp Val Gly Asn Ala Asp Asp Val Leu Ala Tyr Pro Ser Gln
85 90 95

Lys Val Cys Ser Thr Pro Gly Thr Thr Ile Arg Leu Asn Gly Asp Glu
100 105 110

Lys Gly Ser Tyr Ile Gln Ile Arg Tyr Ser Leu Ala Pro Ala Gly Ser
115 120 125

Gly Gly Ala Ser Met Gly Ile Ile Asn Ile Gln Asp Glu Ile Asn Asp
130 135 140

Tyr Met Lys Gly Met Tyr Gly Ala Thr Ser Val Lys Ser Thr Tyr Asp
145 150 155 160

Pro Ser Phe Lys Val Phe Asn Glu Ser Val Thr Pro Gln Tyr Asp Val
165 170 175

Ile Pro Thr Glu Pro Val Asn Asn His Ile Thr Thr Lys Val Ile Asp
180 185 190

Asn Pro Gly Thr Ser Glu Val Thr Ser Thr Val Thr Phe Thr Trp Thr
195 200 205

Glu Thr Asp Thr Val Thr Ser Ala Val Thr Lys Gly Tyr Lys Val Gly
210 215 220

Gly Ser Val Ser Ser Lys Ala Thr Phe Lys Phe Ala Phe Val Thr Ser
225 230 235 240

Asp Val Thr Val Thr Val Ser Ala Glu Tyr Asn Tyr Ser Thr Thr Glu
245 250 255

Thr Thr Thr Lys Thr Asp Thr Arg Thr Trp Thr Asp Ser Thr Thr Val
260 265 270

Lys Ala Pro Pro Arg Thr Asn Val Glu Val Ala Tyr Ile Ile Gln Thr
275 280 285



Gly Asn Tyr Asn Val Pro Val Asn Val Glu Ser Asp Met Thr Gly Thr
290 295 300

Leu Phe Cys Arg Gly Tyr Arg Asp Gly Ala Leu Ile Ala Ala Ala Tyr
305 310 315 320

Val Ser Ile Thr Asp Leu Ala Asp Tyr Asn Pro Asn Leu Gly Leu Thr
325 330 335

Asn Glu Gly Asn Gly Val Ala His Phe Lys Gly Glu Gly Tyr Ile Glu
340 345 350

Gly Ala Gln Gly Leu Arg Ser Tyr Ile Gln Val Thr Glu Tyr Pro Val
355 360 365

Asp Asp Asn Gly Arg His Ser Ile Pro Lys Thr Tyr Ile Ile Lys Gly
370 375 380

Ser Leu Ala Pro Asn Val Thr Leu Ile Asn Asp Arg Lys Glu Gly Arg
385 390 395 400



